Output Rates Among Butter Wrappers: I. Work Curves 
and Their Stability * 


Harold F. Rothe 
Stevenson, Jordan and Harrison, Inc., Chicago, Illinois 


The importance of production data in industrial situations has long 
been recognized. This is true whether these data are used in the analysis 
of engineering problems or of behavioral problems. As Burtt has 
written, production is the most obvious criterion to use in validating a 
selective personnel test (3, 173). Muscio has stressed the use of work 
curves in studying problems of “industrial fatigue” (8). Production 
data, in one form or another, have also been used in analyzing the effects 
of variation in illumination, atmosphere, wage systems, work methods, 
training methods, and so forth. 

Despite this rather wide usage of output data, very little research has 
been published on the various problems of this criterion. For many 
industrial plants little or nothing is known about the pattern, distribu- 
tion, and stability of rates of output of various employees under various 
environmental conditions. Because so little is known about these three 
phenomena it is impossible to generalize about them to any great extent, 
or to make predictions about any given new situation. This is a serious 
obstacle in the development of a scientific industrial management and a 
scientific industrial psychology. 

This paper reports an investigation that was made of these problems 
in a particular plant for a specific type of operation. The data were col- 
lected in an industrial situation and they were analyzed according to the 
methods of experimental psychology. The findings in regard to the 
patterning of rates of output and its stability are presented here. The 
findings in regard to the distribution of output data will be reported in a 
second paper. 


* This study was made in partial fulfillment of the requirements for the degree Ph.D. 
in psychology at the University of Minnesota. The writer wishes to thank his co- 
chairmen, Professors Donald G. Paterson and Miles A. Tinker, for their many helpful 
suggestions. He is also grateful to Mr. John Brandt, President, and the employees of 
the Minneapolis plant, Land O’ Lakes Creameries, Inc., whose cooperation made possible 
this investigation. 


Pertinent Literature 


Before discussing briefly the pertinent publications in these areas it 
might be well to define the terms “pattern” and “stability” as they are 
used here. The words “pattern” or “patterning” of output data indicate 
the distribution of production rates in time so that when they are pre- 
sented graphically the result is a so-called work curve or production 
curve. The term “stability” as used here is analogous to the term 
“reliability” as that is commonly used in psychological work. Thus if an 
operator shows perfectly identical work curves for two different days the 
curves are said to be “stable” (within the limitations of the two days). 

There are two rather general concepts of work curves. Burtt’s so- 
called “typical daily work curve” is so frequently illustrated in textbooks 
that the impression is often left with the reader that this is the only 
shape in which work curves are found (2, 154). Indeed, approximations 
of this curve have been found as by Goldmark and Hopkins (9, 429), and 
by Polatov (10). This work curve (sometimes also called “fatigue” 
curve) has been found most often in heavy operations. It is found when 
some kind of production data for many operators are lumped together 
and there is but little published evidence of this “typical” curve for any 
one operator on any one day. 

The concept of a typical “monotony” curve has been suggested by 
Wyatt (14). This curve is most frequently associated with light repeti- 
tive operations and reported feelings of monotony, as by Marsh (6) and 
by Wyatt, Frost, and Stock (15). It, too, is based primarily upon group 
data and there is very little indication that it is found for any one operator 
on one day. 

The Hawthorne investigators reported a few daily work curves for 
some of their relay assembly test room operators and these curves did not 
appear to fall readily into either of the above two classifications (11, 121). 
They appeared as plus and minus variations about a straight line parallel 
to the abscissa. Composite curves of this straight line nature were also 
found by Wyatt, Frost and Stock (15). 

The stability of industrial work curves has been investigated very 
slightly and apparently by inspectional methods only. Daily work 
curves specific to the days on which they were made by individual 
operators were reported by Vernon (13) and by operators singly and in a 
group whose data were considered together by Kunst (4). 

The questions of whether work curves are specific to the type of 
operation and to the methods of payment were investigated by some of 
the English students. Burnett found average work curves to vary 
separately with method of payment (piece rate and time rate) and also 
with day of the week as well as to co-vary with these (1). Wyatt, Frost, 
and Stock found average curves to vary with method of payment but not 
with type of operations (15). 

In summary: (1) individual and group curves may take any of a 
number of shapes; (2) there are few published data regarding the shapes 
of the work curves of individual operators on single days; (3) work 
curves, for both individuals and groups, may vary “capriciously” from 
day to day, may vary with methods of payment, and may vary more with 
variations in payment than with variations in the nature of the work; 
and (4) in the light of the above, one is treading upon extremely hazardous 
ground when he attempts to predict anything about the pattern or 
stability of work curves for any individual or group in a new situation. 


The Present Problems 


The relative paucity of industrial data indicated above presents a 
severe problem to the industrial psychologist studying the effects of rest 
periods, music, etc. in factories and offices. Although he may wish to 
use work curves as a validity criterion of his work, he often cannot know 
how much data to collect to insure a stable criterion. Frequently, too, 
his opportunities for collecting data will be seriously limited. It is 
desirable, therefore, to find further facts regarding the pattern and sta- 
bility of output data in order to answer some of his questions. 

In the present study it was possible to obtain data on some industrial 
operators performing a light, repetitive, manual operation. The prob- 
lems, then, were to determine the patterning of these output data and 
the stability of this patterning for the operators individually and as a 
group. 


Conditions and Methods of the Investigation 


In an experimental investigation it is possible to control many vari- 
ables such as time, place, work method, and others. In an industrial 
investigation this is usually impossible and the investigator can only 
observe and describe the conditions. The latter was true of this study. 


The study was made in the print room of a Minneapolis creamery. Data 
were collected during the “regular” working hours from Monday, March 15, 
1943, through Monday, March 29, 1943. The hours of work were from 7:30 
a.m. to 4:30 p.m. on Monday through Friday, and from 7:30 a.m. to 12:00 
noon on Saturday. There were daily rest periods from 9:30 a.m. to 9:40 a.m. 
and from 2:30 p.m. to 2:40 p.m. The lunch period was from 11:30 a.m. to 
12:00 noon, daily. These hours represent some overtime because of war 
conditions. The operators did not know, on any day, how late they would 
work that afternoon until shortly before quitting time. They actually worked 
the above hours on each day during the study. Weekly pay was by time plus 
overtime. 

The temperature of the room was measured several times during the study 
with an inexpensive thermometer. In general it varied between 66 degrees 
and 74 degrees Fahrenheit, and it rose slowly but steadily during each day. 
The humidity was not measured but was necessarily somewhat high because 
of the nature of some of the operations in the print room. The illumination 
was furnished by means of windows and skylights and also artificially. The 
foot-candle illumination at the work-table was taken several times and was 
found to range from 7 to 13 f.c. This was more than sufficient for the type of 
work performed (12, p. 19). 

During the two weeks of the study seven different operations were per- 
formed. Of these, one was by far the most common of the seven. This 
operation consisted of wrapping, by hand (to supplement the wrapping ma- 
chines in the room) quarter-pound blocks of butter. These were taken off a 
moving belt, wrapped in single vegetable parchment wrapping papers, and 
replaced on a higher belt. Close observation revealed that there were so many 
slight variations in techniques among the operators that a detailed motion 
analysis was unfeasible for the present purpose. 

The butter was wrapped on a table, the top of which was 32 inches above 
the floor. The operators sat in adjustable chairs along both sides of the table. 
These chairs were movable and no attempt was made to have each operator 
sit in the same position each day although they tended to do this. 

Cenco counters were placed on the table before each operator. Each 
counter had a small arm which the operator depressed whenever she placed 
one pound of butter (wrapped in four quarters) on the upper belt. An assist- 
ant went around the table, in the same order, once each fifteen minutes, and 
recorded the counter readings. On occasion other readings were taken. The 
investigator sat at one end of the table where he could see all operators and 
record all activity other than butter-wrapping. Thus a measure was obtained 
of any inactive time within a fifteen minute period. The data were adjusted 
for such periods by dividing the output for the period by the number of active 
minutes (within five [illegible]) and multiplying by fifteen. This was done for 
all periods of inactivity under five minutes in length in any fifteen minute 
period. This involved the assumption that the work within any fifteen minute 
period was spread evenly over that interval for each operator. The use of the 
counters required the assumption that the operators’ working habits were not 
seriously impaired. This was probably partially true because the manage- 
ment had previously, on occasion, used these counters for a somewhat similar 
purpose. The data for all fifteen minute periods in which the operators were 
inactive for more than five minutes were omitted from the analysis. 
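
The adjustment described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name is hypothetical, and the handling of a period with exactly five idle minutes (here adjusted rather than dropped) is an assumption, since the text speaks only of periods "under" and "more than" five minutes.

```python
from typing import Optional

PERIOD_MINUTES = 15.0        # length of one recording period
MAX_INACTIVE_MINUTES = 5.0   # periods with more idle time are omitted


def adjusted_output(pounds: float, inactive_minutes: float) -> Optional[float]:
    """Scale a period's output to a full fifteen minutes of work.

    Returns None when the period would be omitted from the analysis.
    Assumes, as the paper does, that work was spread evenly over the
    active part of the period.
    """
    if inactive_minutes > MAX_INACTIVE_MINUTES:
        return None  # too much idle time: period dropped
    active_minutes = PERIOD_MINUTES - inactive_minutes
    # output per active minute, multiplied by fifteen
    return pounds / active_minutes * PERIOD_MINUTES
```

For instance, an operator who wrapped 24 pounds while idle for three minutes would be credited with 24 / 12 x 15 = 30 pounds for that period.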

The operators who were observed in this study were regular employees of 
the creamery. All were experienced at hand wrapping. All were women, 
ranging in age from 18 to 39 years, in education from 8 to 12 years, and in 
length of service from one month to eighteen years. All except the one with 
one month’s service were Union members. Data were collected for 16 oper- 
ators. Many of these were incomplete because of necessary job shifting, i.e., 
changes in assignments. The above personnel facts, and all of the production 
data discussed in these two papers, were based upon the eight operators for 
whom complete analyses were made. 

It is unnecessary to dwell upon the variables that were not controlled in 
this investigation. This is properly spoken of as an investigation and not an 
experiment. The following uncontrolled sources of variation may be briefly 
noted: relatively coarse timing, non-mechanical production counting, shifting 
numbers and positions of operators along work-table, presence of investigators 
and other visitors to the print room, rate of flow of butter along belts, hardness 
of butter, atmosphere, illumination, humidity, methods of wrapping, effects of 
shifting operations, age, experience, and private lives of operators, amount of 
talking, and necessity for substitute research assistants in last few days of the 
study. The influence of any and all of these factors can only be surmised. 
The data used in this study for an analysis of the pattern, distribution, 
and stability of the output of industrial workers under “industrial conditions” 
were those collected for eight operators only. The data collected on the other 
operators were considered too incomplete to permit of extensive analysis. 
These data, as mentioned above, were also “adjusted” to furnish “complete” 
fifteen-minute intervals of production in those periods when the operators were 
away from the table or otherwise not busy. 


Results 


Daily work curves were constructed for each operator for each day, 
in the following manner. Units of time, both hours and days, were 
indicated on the abscissa, and rates of production expressed in terms of 
the number of pounds wrapped within a fifteen minute period, were in- 
dicated on the ordinate. Differences in operations were indicated by a 
code; rest periods and lunch periods by ordinates erected from the base 
line; and one curve for each of the entire thirteen days was made for each 
operator.¹ 

Inspection of these curves by the writer and by the members of his 
Ph.D. thesis committee of the Graduate School of the University of 
Minnesota, indicated that they took many different, and probably no 
characteristic, shape. In some instances they resembled “fatigue” 
curves; others resembled “monotony” curves; and the great majority 
of them were more or less “straight-line” curves. There was nothing in 
the data to permit any prediction as to the kind of curve that might be 
expected for any individual on any day, with the exception of the facts 
discussed below in the correlational analysis. The curves for each 
operator varied among themselves and did not appear, by inspection, to 
be consistent from day to day. 

A so-called group daily curve was constructed for each day in the 
following manner. For every fifteen minute period throughout the 
study, the available production rates of all operators were listed. The 
median for each of these periods was used as the output rate for that 
period on this group curve. Thus each point on it is the median of eight 
individual readings, except for those periods when some of the operators 
were not working. At no time were fewer than three readings used in 
establishing this median. 
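
A minimal sketch of how such a group daily curve might be computed follows. The function name, the dictionary layout, and the use of None for a missing reading are illustrative assumptions; only the median rule and the three-reading minimum come from the text.

```python
from statistics import median
from typing import Dict, List, Optional

MIN_READINGS = 3  # the text reports that no median rested on fewer readings


def group_daily_curve(
    rates_by_operator: Dict[str, List[Optional[float]]]
) -> List[Optional[float]]:
    """Median output rate across operators for each fifteen-minute period.

    Each operator's list holds one rate per period of the day, with None
    where she was not working. A period with fewer than MIN_READINGS
    available rates yields None.
    """
    n_periods = len(next(iter(rates_by_operator.values())))
    curve: List[Optional[float]] = []
    for i in range(n_periods):
        readings = [r[i] for r in rates_by_operator.values() if r[i] is not None]
        curve.append(median(readings) if len(readings) >= MIN_READINGS else None)
    return curve
```

With four hypothetical operators whose rates are [20, 22, None], [24, None, None], [22, 25, 30], and [None, 27, None], the curve would be 22 and 25 for the first two periods and undefined for the third, where only one reading is available.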


¹ These original curves and all raw data have been placed on file with Dr. Miles A. 
Tinker in the Department of Psychology at the University of Minnesota. Photo- 
static copies of the curves are in the writer’s thesis on file in the University of Minnesota 
Library. 










This group daily work curve likewise failed to show any consistent 
common patterning, by inspection, with these two exceptions: (1) the 
straight-line curve was again the most common, although “fatigue” and 
“monotony” curves were also in evidence; and (2) rest periods were fre- 
quently followed by a lowered production and lunch periods by a higher 
production. 

A “daily trend line” was established for each operator. Each operat- 
or’s median output rate for every fifteen minute interval, regardless of 
days, was determined, and these medians were connected to make an 
“average” or “trend” line that was assumed to be independent of any 
influence that could be attributed to particular days of the week. These 
trend lines were also analyzed by inspection. Those for two operators 
resembled the “typical work curves” or “fatigue” curves. Two others 
showed what might be considered “monotony” curves, and the other four 
operators had “mixed” trend lines. One characteristic of all trend lines 
was that the lowest point for every operator occurred in the first fifteen 
minute period of the day, and the lowest prolonged period (plateau) 
occurred in the late afternoon. This latter might be considered evidence 
of “fatigue,” or better, could be defined as fatigue. 

Group trend lines were also established. Two of these were made, 
using two techniques. The first was to “collapse” all of the group daily 
work curves on to one curve, thus eliminating the effects of variations 
among days. The second method was to locate the points from each of 
the individual trend lines onto one graph and to connect the medians of 
these for each fifteen minute interval. These two group trend lines 
showed the same general characteristics, as might be expected. They 
both began low and rose steadily in the first morning work-spell, levelled 
off in the second morning work spell, began at a lower level after lunch 
but rose quickly until they reached a peak for the day just before the 
afternoon rest period, and then showed a “dip,” resembling a “monotony” 
curve, in the late afternoon spell while remaining at a generally lower 
level in this last period. 
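
The trend lines, individual and group, all rest on the same operation: a period-by-period median across a set of curves. The sketch below is illustrative only. The function name is hypothetical, and reading "collapse" in the first group method as a median across days is an assumption, since the text does not say how the daily curves were combined.

```python
from statistics import median
from typing import List, Optional, Sequence


def trend_line(curves: Sequence[Sequence[Optional[float]]]) -> List[Optional[float]]:
    """Period-by-period median across several curves of equal length.

    None marks a missing reading; a period with no readings stays None.
    """
    n_periods = len(curves[0])
    line: List[Optional[float]] = []
    for i in range(n_periods):
        readings = [c[i] for c in curves if c[i] is not None]
        line.append(median(readings) if readings else None)
    return line


# Individual trend line: pass one operator's daily curves.
# Group trend line, first method (assumed reading of "collapse"):
#   pass the group daily curves, one per day.
# Group trend line, second method:
#   pass the individual trend lines, one per operator.
```

Because both group methods draw on the same underlying readings and differ only in the order of taking medians, close agreement between them is to be expected.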

In summary, the individual trend lines varied from operator to opera- 
tor, taking several general shapes. The individual and also the group 
trend lines showed a few common phenomena such as an early morning 
warming-up period, a tendency toward a general decrease during the 
day, and an apparently relatively inefficient spacing of the rest periods. 

In addition to the inspectional analyses described above, the work 
curves were analyzed by correlational techniques. The method was, in 
general, to correlate one curve with another curve, obtaining Pearson 
r’s. The paired variates were the two readings for any given fifteen 
minute period. 
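
A minimal sketch of this pairing follows, using the ordinary product-moment formula. The function name and the use of None for a missing reading are illustrative assumptions; the minimum of 18 pairs is the threshold the paper reports for its own coefficients.

```python
from math import sqrt
from typing import Optional, Sequence

MIN_PAIRS = 18  # no coefficient was computed from fewer paired variates


def curve_correlation(
    x: Sequence[Optional[float]], y: Sequence[Optional[float]]
) -> Optional[float]:
    """Pearson r between two work curves, paired by fifteen-minute period.

    Periods missing from either curve are skipped. Returns None when fewer
    than MIN_PAIRS pairs remain or when either curve has no variation.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < MIN_PAIRS:
        return None
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None  # r is undefined for a perfectly flat curve
    return sxy / sqrt(sxx * syy)
```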
Although no correlations were computed where there were not at 
least 18 paired variates, most correlations involved about 30 sets of 
readings. In order to minimize the effects of these small samples, the 
practice was to determine a large number of correlations dealing with 
each question (as far as was possible) and then to consider the distribution 
of these coefficients, rather than to attach much significance to any one 
of them. Further, in determining these correlations the data for five 
days only were utilized. On these days all of the eight operators per- 
formed the same operation (wrapping quarter pound blocks of butter) 
all day long. These days were Monday 2 (the second Monday of the 
investigation), Tuesday, Wednesday, and Thursday (also all of the 
second week), and Monday 3. 


Table 1 

Distribution of Correlation Coefficients between Work Curves of Different Days 
for Same Operators 

Range of r        Frequency 
 .60 to  .69          2 
 .50 to  .59          1 
 .40 to  .49          5 
 .30 to  .39          5 
 .20 to  .29          5 
 .10 to  .19          8 
 .00 to  .09          7* 
-.01 to -.10         12 
-.11 to -.20          7 
-.21 to -.30          4 
-.31 to -.40          4 
-.41 to -.50          1 

* Indicates location of median of distribution. 

The first correlational problem attacked was that of the stability of 
an individual operator’s daily work curve from day to day. This is 
analogous to the problem of test-retest reliability. To answer this 
problem, each operator’s work curve for one day was correlated with her 
work curve for every other day. There were ten such pairs of days and 
eight operators. On a few occasions some operators missed some of 
these days and hence only 61 (rather than 80) correlations were obtained. 
The distribution of these correlations is shown in Table 1. 

Thus, using the median coefficient of the distribution in Table 1, the 
inter-day correlation between the work curves for any one operator was 
negligible (approximately .05). 
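
The counts behind this result can be checked with a short sketch. The frequency for the .00 to .09 interval is taken here as 7, the value implied by the stated total of 61 coefficients; that reading, like the variable and function names, is an assumption rather than something the text states directly.

```python
from itertools import combinations

days = ["Monday 2", "Tuesday", "Wednesday", "Thursday", "Monday 3"]
day_pairs = list(combinations(days, 2))   # every pair of distinct days
possible = len(day_pairs) * 8             # eight operators

# Table 1 as (interval, frequency), listed from highest to lowest r.
# The 7 for .00 to .09 is inferred from the reported total of 61.
table_1 = [
    ((0.60, 0.69), 2), ((0.50, 0.59), 1), ((0.40, 0.49), 5),
    ((0.30, 0.39), 5), ((0.20, 0.29), 5), ((0.10, 0.19), 8),
    ((0.00, 0.09), 7), ((-0.01, -0.10), 12), ((-0.11, -0.20), 7),
    ((-0.21, -0.30), 4), ((-0.31, -0.40), 4), ((-0.41, -0.50), 1),
]


def median_interval(table):
    """Return the interval that contains the middle-ranked coefficient."""
    total = sum(freq for _, freq in table)
    middle_rank = (total + 1) / 2
    running = 0
    for interval, freq in table:
        running += freq
        if running >= middle_rank:
            return interval
    raise ValueError("empty table")
```

Five days give ten pairs of days, hence 80 possible coefficients for eight operators, and the middle (31st) of the 61 obtained falls in the .00 to .09 interval, in keeping with the reported median of approximately .05.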
To ascertain the possible presence of any sort of group social inter- 
acting (i.e., talking, common reaction to events or days of the week, etc.) 
a series of coefficients was determined for the work curves of individual 
operators on common days. That is, the correlation between Operators 
1 and 2, both on Tuesday, etc. It did not appear necessary to correlate 
all possible combinations and a sample of 35 combinations was selected 
with the aid of tables of random numbers (5, 262 ff.). The distribution 
of the obtained coefficients is shown in Table 2, where the median is 
shown to be about .27. 

Table 2 

Distribution of Correlation Coefficients between Work Curves of Different 
Operators on the Same Days 

Range of r        Frequency 
 .70 to  .79          1 
 .60 to  .69          0 
 .50 to  .59          5 
 .40 to  .49          7 
 .30 to  .39          3 
 .20 to  .29          8* 
 .10 to  .19          4 
 .00 to  .09          1 
-.01 to -.10          1 
-.11 to -.20          3 
-.21 to -.30          0 
-.31 to -.40          1 
-.41 to -.50          1 

* Indicates location of median of distribution. 


From Tables 1 and 2 it may be concluded that the work curves for 
the various operators showed a slight tendency to take the same shape on 
any one day, while the shapes of any one operator’s work curves bore no 
relation to one another from day to day. This finding, here by correla- 
tional techniques, is in accordance with the conclusions of the studies 
described by Roethlisberger and Dickson (11), Wyatt, Frost, and Stock 
(15), and Mayo and Lombard (7). 

In order to test further the inter-day relations among the work curves 
for each operator, another series of correlation coefficients was obtained 
in which the curve for one operator on one day was correlated with the 
curve for a different operator on another day. Thirty-five such pairs of 
curves were selected, again with the aid of tables of random numbers. 
The distribution of these is shown in Table 3. 

Table 3 

Distribution of Correlation Coefficients between Randomly Paired 
Individual Work Curves 

Range of r        Frequency 
 .60 to  .69          1 
 .50 to  .59          2 
 .40 to  .49          3 
 .30 to  .39          4 
 .20 to  .29          1 
 .10 to  .19          6 
 .00 to  .09          5* 
-.01 to -.10          7 
-.11 to -.20          2 
-.21 to -.30          2 
-.31 to -.40          1 
-.41 to -.50          1 

* Indicates location of median of distribution. 


The median of the distribution in Table 3 is .05. Thus, as can be 
seen from Tables 1, 2, and 3, an operator’s work curve for any one day 
bears little or no relation to her own curve for any other day or to the 
curve of any other operator for any other day. But when the work curves 
for different operators on any one day are considered, there is some rela- 
tionship. 

Using these same correlational techniques the group daily work curves 
were also analyzed to determine if they were stable from day to day. It 
would be anticipated from the above that this might be true. The 
matrix of these coefficients is presented in Table 4. 











Table 4 

Correlation Coefficients between Group Work Curves for Different Days 

             Tuesday   Wednesday   Thursday   Monday 3 
Monday 2       .22        .34        .31        .08 
Tuesday                   .26        .53       -.09 
Wednesday                            .50        .34 
Thursday                                       -.22 





These coefficients range from .53 to —.22 and the median is about 
.30. Thus the group as a whole tended to show the same work pattern 
from day to day, although the correspondence was not very high. These 
correlations were highest between Tuesday, Wednesday, and Thursday. 
It is possible that the relatively peculiar work curves for Monday 2 and 
Monday 3 might be functions of the week-ends or of week-ending. The 
data for Saturdays have been omitted from all correlations because of the 
small number of available readings. 
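
The summary figures quoted for Table 4 can be reproduced directly from its ten coefficients. In the sketch below the entries are read with their decimal points restored (for example .34 and .50), which is an assumption about the printed table; the variable names are illustrative.

```python
from statistics import median

# Coefficients between group daily curves, keyed by the pair of days.
group_curve_r = {
    ("Monday 2", "Tuesday"): 0.22, ("Monday 2", "Wednesday"): 0.34,
    ("Monday 2", "Thursday"): 0.31, ("Monday 2", "Monday 3"): 0.08,
    ("Tuesday", "Wednesday"): 0.26, ("Tuesday", "Thursday"): 0.53,
    ("Tuesday", "Monday 3"): -0.09, ("Wednesday", "Thursday"): 0.50,
    ("Wednesday", "Monday 3"): 0.34, ("Thursday", "Monday 3"): -0.22,
}

overall_median = median(group_curve_r.values())

# Pairs drawn only from Tuesday, Wednesday, and Thursday.
midweek = [r for (a, b), r in group_curve_r.items()
           if "Monday" not in a and "Monday" not in b]
# Pairs in which at least one day is a Monday.
with_monday = [r for (a, b), r in group_curve_r.items()
               if "Monday" in a or "Monday" in b]
```

The overall median falls between .26 and .31, consistent with the "about .30" given in the text, and the midweek pairs have a median of .50 against .22 for pairs that include a Monday.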

In a similar manner, the individual trend lines were also correlated, 
and since the group daily curves were related, it appeared probable that 
the trend lines which amounted to “‘averages” for individuals over a 
period of days might be related to an appreciable degree. This was 
found to be so, as shown in Table 5, where the range of coefficients is 
from .15 to .70, with the median at .51. 











Table 5 

Correlation Coefficients between Trend Lines for Each Operator 

Operator 
Number      2     3     4     5     6     7     8 
   1       .68   .39   .62   .70   .51   .61   .15 
   2             .27   .59   .54   .51   .47   .33 
   3                   .66   .57   .48   .33   .55 
   4                         .59   .60   .63   .43 
   5                               .47   .62   .49 
   6                                     .52   .43 
   7                                           .42 





The distribution of these coefficients is shown in Table 6. This 
distribution is skewed as would be expected if there were actually some 
positive correlation. 











Table 6 

Distribution of Correlation Coefficients between Trend Lines for Each Operator 

Range of r        Frequency 
 .61 to  .70          7 
 .51 to  .60          9 
 .41 to  .50          7 
 .31 to  .40          3 
 .21 to  .30          1 
 .11 to  .20          1 
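
As a check on the tabulation, the 28 coefficients of Table 5 can be sorted into the intervals of Table 6. The sketch assumes that the two entries that are unclear in the printed table (operators 2 with 7, and 5 with 6) are both .47, a reading consistent with the frequencies in Table 6; the names used are illustrative.

```python
from statistics import median

# Upper triangle of Table 5, row by row (operators 1 through 7).
# The two .47 values marked below are assumed readings of unclear entries.
trend_r = [
    0.68, 0.39, 0.62, 0.70, 0.51, 0.61, 0.15,
    0.27, 0.59, 0.54, 0.51, 0.47, 0.33,      # 0.47: assumed
    0.66, 0.57, 0.48, 0.33, 0.55,
    0.59, 0.60, 0.63, 0.43,
    0.47, 0.62, 0.49,                        # 0.47: assumed
    0.52, 0.43,
    0.42,
]

# Intervals of Table 6 in hundredths, highest first.
intervals = [(61, 70), (51, 60), (41, 50), (31, 40), (21, 30), (11, 20)]

# Work in whole hundredths so that boundary values are compared exactly.
hundredths = [round(r * 100) for r in trend_r]
frequencies = [sum(lo <= h <= hi for h in hundredths) for lo, hi in intervals]
trend_median = median(trend_r)
```

Under that reading the frequencies agree with Table 6, and the median lies between .51 and .52, in line with the reported .51.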





It appears, then, that the different operators tended to vary together 
in production pattern when a fairly long period of time was considered, 
i.e., several days. The reasons for this could not be determined from the 
present data, although it is probable that the group interaction or the 
group common reaction mentioned above may have been influential. 
Another possible reason for this tendency toward higher correlations 
among trend lines than among daily work curves is that the trend lines 
are more “stable” or “reliable” than are the daily curves. It is also 
possible that the correlations among individual trend lines are higher than are the 
correlations among the group curves from day to day because the trend 
lines are constructed from data covering several days and hence include 
inter-day variation as well as intra-day variation. The group daily 
curves include inter-individual but not inter-day variation. 

All of the correlations that have been mentioned tended to be positive. 
Even those distributions with medians of approximately 0.00 extended 
further into the positive values of r than into the negative values. This 
suggested that the inter-correlation between the two group trend lines, 
both based on essentially the same data, but derived by different methods, 
would be high and positive. A coefficient of .87 was obtained when the 
two group trend lines were correlated. 

To the extent that this coefficient of .87 was roughly predicted from the 
other coefficients, it serves partially to validate those others and to indi- 
cate that they are probably rather accurate estimates of the inter-rela- 
tionships that actually prevail. And to the extent that the group trend 
lines were established by different methods this latter coefficient suggests 
that the situation they represent is a fairly stable one, and that different 
methods of treating sufficient data of that situation give essentially the 
same results. 

In summary, the following correlational results may be tentatively 
presented: the correlation between an operator’s work curve for one day 
vs. her curve for another day is about 0.05; the correlation between one 
operator’s curve and the curve for any other operator on any different 
day is also about 0.05; the correlation between the curves for any two 
operators on any one day is about 0.30; the correlation between any two 
operators’ trend lines (covering several days) is about 0.50; and the 
correlation between group trend lines for several days and established by 
different methods is about 0.90. 


Summary and Conclusions 


The conclusions to be drawn from this investigation must be con- 
sidered tentative rather than definite, because of the serious limitations 
imposed by the small number of operators, the restricted nature of the 
operations, and the short period of time involved. Within these limita- 
tions the following general conclusions may be drawn: 


1. Individual daily work curves for this, and probably other, indus- 
trial jobs involving light manual skills, may take any of many different 
forms and do not assume any characteristic, predictable pattern. 

2. Individual daily work curves are, as the name implies, specific to 
the individual and also, but to a lesser extent, specific to the day. They 
are correlated with each other only to the extent that the early warming- 
up period is a common phenomenon. 

3. Individual trend lines, based on the data of several days, are more 
highly related among different individual industrial operators than are 
individual daily work curves. 

4. Group trend lines, regardless of method of construction, and within 
the limits used in the present analysis, represent a stable phenomenon. 
It is to be expected that group trend lines based upon samples of different 
periods of time, but on the same operations, would be intercorrelated 
rather highly. Thus group trend lines should be used when work curves 
are used as criteria against which some variable is to be measured or 
validated since they form stable criteria. 

5. The correlational technique applied to work curves is one that 
may well be applied more widely in future industrial research on work 
patterning. 

6. Industrial management, in studying the effects of rest periods, 
music in factories and offices, illumination, etc., by analyzing the effects 
of these variables upon work curves, would be wise to collect data cover- 
ing several different operators and several different days in order to 
establish a “stable” work curve of output. If this is impossible, it would 
be next most valuable to obtain data on one operator over several days 
and, as a third choice, to collect data on several operators on any one day, 
and thus construct a reasonably accurate estimation of the work curve for 
the operation under consideration. There is little or no value in collecting 
data for one operator on one day only. 


Received June 25, 1945. 
Management’s Reactions to Employee Opinion Polls 


Robert N. McMurry 
Robert N. McMurry & Co., Chicago, Illinois 


Ordinarily the public thinks of labor disputes as having their origin in 
such major issues as wages and hours. This is not strictly true. While 
they serve to rationalize attacks on management, the true causes are the 
multitude of trivial annoyances to which the employee is subjected day in and 
day out. The chief reason why they are allowed to continue and to create 
worker hostility is the fact that management rarely knows of their exist- 
ence, and hence does nothing about them. 

One of the best tools for discovering the true sources of worker dis- 
satisfaction is the so-called “Employee Opinion Poll” or the “Employee 
Morale Survey.” Any organization which is genuinely interested in 
building and maintaining good employee morale should at intervals 
obtain a measure of its employees’ attitudes by means of one of these 
polls. Such a poll consists of a series of questions which are asked of 
workers on the job to ascertain what they like and dislike about their 
working conditions, supervision, management policies, employee services, 
rates of compensation, and their attitudes toward the personalities and 
competence of top management. The form used is of the multiple 
answer type usually carrying from twenty to sixty specific questions. 
Each of these has four alternative answers. A typical item follows: The 
equipment with which I work is: a. ____ Very satisfactory; b. ____ In 
good condition; c. ____ Somewhat unsatisfactory; d. ____ Definitely 
unsatisfactory. 

The employee answers the items on the form by placing a check mark 
before the particular one among the four responses which best represents 
his answer to the question. In this way a uniform measure of employee 
opinion is obtained and no writing is required of the individual. He is 
requested not to sign his name, or identify himself in any way, thus 
insuring that his responses are strictly anonymous. Furthermore, report 
to top management is in summary form, showing the distribution of 
answers to each item for each department. This further insures against 
any possibility of individual identification. This latter is important 
because, if the employee believes that his identity is not protected he 
may be motivated to give untruthful answers that will subject him to no 
risk of retaliation by supervision or others. 

In practice, opinion polls are usually administered as follows: A 
representative of management makes a brief explanation to the employees 
to be polled, outlining the purpose of the project. Following this, the 
employees are asked each to pick a questionnaire from a pile placed on 
the table, so that there is no danger to an individual that his form has 
been keyed. After he has answered the questions, he puts the form in a 
ballot box on the table. After all the questionnaires have been inserted, 
the box is sealed and given to an outside organization for tabulation. 
This is done to give the employees further assurance that the identity of 
each is protected. Furthermore, never fewer than ten questionnaires are 
placed in any one box to avoid any additional possibility of identification. 

Wherever possible, a separate ballot box is provided for supervision 
and for each department, shift, and supervisory unit. Where sufficient 
numbers are available to make it feasible, men and women have different 
colored questionnaires. Thus it is possible to analyze the results sepa- 
rately for supervision, for men and women, and for each department, 
shift and supervisory unit. 
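
The reporting rule described above (a distribution of answers to each item for each department, with no ballot box holding fewer than ten forms) can be sketched in a few lines of Python. This is an illustration only: the function name `summarize`, the item label, the department names, and the ballots themselves are hypothetical and do not come from any actual poll.

```python
from collections import Counter, defaultdict

MIN_GROUP_SIZE = 10             # the text's rule: never fewer than ten forms per box
CHOICES = ("a", "b", "c", "d")  # four alternative answers per item


def summarize(ballots, min_group_size=MIN_GROUP_SIZE):
    """Return {department: {item: {choice: count}}} for groups that are large enough.

    `ballots` is an iterable of (department, answers) pairs, where `answers`
    maps an item label to the checked choice. No respondent identity is stored.
    """
    by_department = defaultdict(list)
    for department, answers in ballots:
        by_department[department].append(answers)

    summary = {}
    for department, forms in by_department.items():
        if len(forms) < min_group_size:
            continue  # too few forms: withhold the group to protect anonymity
        counts = defaultdict(Counter)
        for answers in forms:
            for item, choice in answers.items():
                counts[item][choice] += 1
        summary[department] = {
            item: {choice: tally[choice] for choice in CHOICES}
            for item, tally in counts.items()
        }
    return summary


# Hypothetical ballots: twelve from one department, three from another.
ballots = (
    [("Assembly", {"equipment": "a"})] * 7
    + [("Assembly", {"equipment": "c"})] * 5
    + [("Shipping", {"equipment": "d"})] * 3
)
summary = summarize(ballots)
print(summary)
```

In this sketch the three forms from the smaller department are withheld rather than reported, which mirrors the precaution of never placing fewer than ten questionnaires in one box.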

While some advocate the administration of these polls by mail, it is a 
practice to be discouraged. First, only a part of the employees will 
answer. Consequently, it is impossible to be certain of the extent to 
which the sample is truly representative. Second, a questionnaire sent 
to the home is likely to be filled out in consultation with other family 
members and often associates in the union. As a result, it is difficult to 
ascertain the extent to which the opinions expressed are those of the 
employee. Third, the fact that the employee is asked to mail in the form 
may make him question the degree to which his identity has been pro- 
tected since it is easy to key individual forms. This in turn, may influ- 
ence his responses by leading him to play safe by expressing few dissatis- 
factions. 

The contribution of the employee opinion poll to the building and 
maintenance of employee morale is that it serves as an important medium 
of communication between the employees and management. It is necessary 
because, unfortunately, as companies grow larger, there is a tendency 
for the relationship between the man on the machine or at the desk and 
top management to become increasingly distant. This problem does not 
exist in the smaller organization where direct personal contacts exist. 
Where this personal contact has been lost, communication between 
management and the worker begins to break down. On the one hand, 
there is a tendency for company policies, as they are transmitted downward 
from management through the hierarchy of supervision to the 
worker, to undergo modification or even distortion. A policy may have 
been sound and fair when it was approved by top management, but by 
the time it is applied, it may have become so altered that it is no longer 
recognizable and may be far from sound and fair, yet because of this 
failure of communication, management is quite unaware of any change. 

On the other hand, in communication upward from the worker to 
management, employee dissatisfactions and grievances are frequently 
blocked by members of supervision. In any organization there are 
always some who are desirous of creating the impression with their 
superiors that there is no dissatisfaction among their subordinates. 
Hence, they make every attempt to repress and conceal evidences of 
poor morale among those who report to them. Nor do they encourage 
their subordinates to bring their troubles to them. In consequence, top 
management is often not kept informed of what the employees are actu- 
ally thinking or is not made aware of legitimate complaints which they may 
have. As a result of these failures of communication, employees are 
frequently subjected to unnecessary injustice and frustration. The re- 
sulting dissatisfactions, being denied outlet or redress, tend to accumu- 
late, and create a generalized hostility toward the company and manage- 
ment in general. This is invariably destructive to morale. 

The employee opinion poll is designed to provide one avenue of com- 
munication between the workers and top management. It provides the 
man on the machine or at the desk with an opportunity, free from any 
danger of retaliation by his superiors, to express frankly and without 
reservation his likes and dislikes with respect to his job, the people with 
whom he is associated and the company as a whole. Thus, the opinion 
poll brings the principal sources of employee dissatisfaction to the atten- 
tion of top management, enabling it to take steps to eliminate them and to 
take the initiative in counteracting their effects. It also locates those 
sore spots in the company organization which require immediate attention 
so that corrective measures can be applied where they are most needed 
first. 

Frequently, the chief sources of employee ill-will and poor morale are 
not such major items as wages or hours, but rather are the many petty 
annoyances to which workers are constantly subjected. Typical of 
these latter are warm drinking water, drafts, poor illumination, inade- 
quate lockers, the number and placement of the time clocks; even such 
items as the fact that in the company cafeteria supervision may get 
larger helpings than do rank and file employees. The reason why these 
trivial and often picayunish complaints are of such major importance is 
the fact that they are repeated every day; they are inescapable. Furthermore, 
because workers cannot “talk them out” with immediate supervision, 
and there are no channels of communication with top management, these 
dissatisfactions tend to accumulate and create anti-company attitudes. 
This is a serious causative factor in labor trouble. 

The opinion poll also has a salutary effect upon employee morale in 
another direction: It provides the workers with an outlet for their dis- 
satisfactions, thus serving as a cathartic agent. Having gotten their 
troubles off their chests, they feel much better. Likewise, the fact that 
management has evidenced sufficient interest in them as individuals to 
provide them with an opportunity thus to express their likes and dislikes 
tends also to create a bond of sympathy and goodwill between them and 
management. 

The major resistance to the use of employee opinion polls arises not 
from employees or labor organizations, as might be expected, but from 
management itself. In one case in the author’s experience a C.I.O. local 
itself initiated a request for an employee opinion poll in a labor dispute 
and agreed to abide by the findings. Management, on the other hand, 
flatly refused to permit such a poll to be conducted, even though the work 
was to be done by an independent, outside organization. 

In view of the contributions which the opinion poll can make to employee 
morale, it is, at first glance, curious that it encounters so much 
resistance from top management. Actually, however, the explanation is 
relatively simple. For all of their prestige and authority, members of 
management are frequently extremely insecure. 

Many are surprisingly lacking in self-assurance, the feeling of being 
on top of their jobs; some consciously, others so beset by anxiety that 
they cannot face their weakness and must repress the entire conflict. 
In view of this, they dare not approve the use of such an instrument, the 
employee opinion poll, which might reveal their own shortcomings. In 
short, many lack the courage to face unpleasant and uncomplimentary 
revelations relative to the kind of a job they are doing. Unfortunately, 
such polls again and again reveal conditions which management prefers 
not to face. Furthermore, the very act of asking for employee opinions 
also commits management to take action to correct conditions found to be 
bad when it would be easier not to. Typical of the conditions an em- 
ployee poll is apt to reveal are: 1. Poor operating methods; 2. Undesirable 
working conditions; 3. Weaknesses in supervision; 4. Inconsistencies and 
inequities in company policies; and 5. Hostilities toward top management 
itself. Inasmuch as many of these reflect directly on management’s 
competence, it is obvious that many executives are not eager to have 
them brought to light. 

Under such circumstances management usually feels it safest to let 
sleeping dogs lie. At the same time, however, management cannot admit 
or may not even be consciously aware of the true reasons for its reluct- 
ance to permit the conduct of an employee opinion poll. Hence its 
response to this threat to its security is to provide plausible rationalizations 
for its attitude. Where management’s anxieties are so powerful 
that they cannot be faced at all, the excuses given for refusing to permit 
an employee opinion poll are sincerely believed by the executives them- 
selves. 

The most common reasons advanced by management for refusing to 
permit employee opinion polls to be conducted are: 


1. The poll will “suggest” dissatisfactions, thus creating poor morale 
and ill-will toward the company. 

2. The conduct of the poll will upset employees emotionally, distract 
them from their work, and result in much discussion of these matters 
both on and off the job, to the disadvantage of operating efficiency; there 
is both a direct and indirect loss of production time. 

3. Many employees will refuse to answer the questions or will give 
silly or irrelevant responses. 

4. The bringing of issues such as those covered by an opinion poll out 
into the open will give the unions ammunition to attack the company and 
even lead to open outbreak of labor troubles. 


Actually, none of these criticisms is likely to be valid. Numerous 
polls have been conducted in both large and small organizations through- 
out the country. Where they have been properly administered, none of 
the conditions which management fears have developed. Specifically, 
the experience of companies which have used them properly has been as 
follows: 


1. The charge that polls create dissatisfaction: This criticism has been 
proved false by two factors: First, it is almost always found that there are 
marked differences in the kind and degree of dissatisfaction found from 
department to department. If the poll itself were creating this dissatis- 
faction, there would be a much greater degree of uniformity. Second, 
a further more detailed investigation of the dissatisfactions voiced on the 
poll has revealed almost in every case that they actually exist. In short, 
the polls bring out only real dissatisfactions which already exist and do 
not cause employees to imagine new ones. 

2. Polls require too much time and distract employees: Even the longest 
polls require not in excess of thirty minutes to explain, administer and 
collect. Rarely is there much discussion after the poll has been conducted. 
The fact that the employees have had an opportunity to get their dis- 
satisfactions off their chests has the effect of reducing existing emotional 
tensions. They actually feel better. It is true that in some instances 
there is a certain degree of “kidding” of supervision after the poll has 
been taken, e.g., an employee will go to his supervisor, if he likes him, 
shake his hand, and say “Glad to have known you because, after management 
has learned what we think of you, you won’t be with us any longer.” 
This actually is a sign of good morale. Rarely, if ever, are employees 
upset by a poll. 

The executives who decry a loss of production time are obviously 
sticking their heads in the sand. They refuse to face the obvious fact 
that improved employee morale invariably means increased productivity. 
Their motives become transparent when one observes how quickly they 
buy costly plant equipment based on this same argument of future 
productivity. 

3. Employees will not answer the questions or will give irrelevant responses: 
If the purpose of the poll has been sufficiently explained in advance, 
no more than one or two per cent of the employees will give 
“smart aleck” responses, and practically none will refuse to answer. 
Even where employees are strongly anti-management, the opportunity 
provided by the poll anonymously to tell management what they think 
of the company is so attractive that the individual can rarely resist the 
opportunity to unburden himself in full. Where irrelevant and “smart 
aleck” responses are written in, this in itself is an indication of serious 
hostilities toward, or distrust of, management. 

4. They will lead to outbreaks of labor troubles: Opinion polls are 
ordinarily not discussed with the union in advance of their administration 
(otherwise they might be “framed”). This is sometimes resented, 
but at least in the author’s experience, no reluctance is encountered on 
the part of union members to answer the questions, nor are there un- 
fortunate consequences resulting from the administration of the polls. 
In one instance, the president of a local union announced that no member 
of the union would answer the questions on the poll. In spite of this, 
every member responded in full, and even the president, once he saw the 
questionnaire, found it impossible to resist the temptation to tell the 
management in strict confidence what he thought of it. 


In another case, a group of organized railroad employees were given 
an opinion poll. Here each department had a “griever,” and manage- 
ment feared that immediately each griever would call headquarters and 
the employees would be called on strike. Actually, not a single griever 
called the general chairman, and the immediate effect of the poll was a 
marked and consistent improvement in employee attitudes. 

As already indicated, at least in one case, to the author’s knowledge, 
a union has asked for a poll. In short, if the labor organization has 
confidence in the integrity and essential fairness of the company, there 
is no reason to anticipate trouble of any sort. The best proof of this is 
the fact that polls of this character have been conducted in numerous 
companies which have been organized by strong, aggressive unions, and 
in no instance, in the author’s experience, has trouble resulted from the 
taking of the poll. 

Since the basic reason for management’s resistance to opinion polls 
lies in its own anxieties, the only sure way to overcome it is to allay 
management’s fears for its own security. Four methods have proved 
helpful in dealing with this problem: 


1. By referring management to executives of other companies which have 
conducted these polls successfully. If executives in these latter companies 
explain that, at least in their experience, the polls not only do not cause 
trouble, but are sources of valuable information, this is extremely helpful 
in allaying management’s fears. It is particularly advantageous to 
arrange contact with companies which have conducted not one, but a 
series of these polls, because this practice tends to give clear evidence 
that the polls are not only free from danger, but actually contribute 
information which is helpful. 

2. By making it clear to management that the findings will be handled 
confidentially. In short, only those executives who are most likely to be 
injured by the findings will be shown the complete report. While it is 
essential that groups of employees be informed of those particular findings 
which relate to them, it is not necessary for them to be given the overall 
picture. Consequently, if conditions are unusually bad, no one but top 
management need know the extent and sources of employee dissatis- 
faction. In this manner, management is enabled to save face, while at 
the same time the findings may be employed constructively. 

3. Where the poll is recommended by an outside consultant, he can accept 
the responsibility for the effect of the poll upon the employees. If he is 
willing to stake his reputation upon the outcome of the poll, management 
is sometimes willing to take the chance. 

4. Sometimes management will consent to the trial of a poll in one depart- 
ment on a pilot basis. Later, reassured, it will go ahead with the entire 
organization. 

In actual practice there is only one danger associated with the use of 
employee opinion polls. This is the failure of management to do its 
part in the correction of conditions revealed by the questionnaire. Frequently, 
a poll of this character brings out conditions which either are 
embarrassing to management or present it with rather difficult problems. 
For example, the employees of a particular department may express 
strong hostilities toward their foreman. They may charge him 
with incompetence, with playing favorites, even with open dishonesty. 
On the other hand, this man may have been with the company for many 
years. In addition, the management may regard him quite highly be- 
cause his costs are low and he is an excellent technician. Yet in spite of 
these favorable factors, it becomes incumbent upon management either 
to give this supervisor special leadership training, or, in extreme cases, to 
transfer or replace him. This latter may confront management with a 
serious problem, either because of the man’s length of service or because 
it has no one with whom to replace him. In spite of this, it is essential 
that management, if it is to hold the confidence and respect of its employees, 
take action. If the executives dodge the issue and take no 
action, hoping wishfully that the employees will forget their dissatis- 
factions or that the man will reform voluntarily, they will be doing them- 
selves a great disservice. Not only will the initial cause of the dissatis- 
faction, i.e., the department head or foreman, remain, but a new ground 
for distrusting management will have been established. The employees 
will feel with justification that management has acted in poor faith. It 
has asked them to give their frank opinions with the implied promise 
either that conditions would be corrected or that at least management 
would give them an explanation of why action has not been taken. 
Where nothing is done, employees rightfully feel that they have been 
given the run-around and that their confidence has been abused. Under 
such circumstances subsequent employee opinion polls or other manage- 
ment projects will not be accepted readily by the workers. 

On the other hand, if company executives will be honest with themselves 
and with the employees and make a genuine effort to eliminate the 
conditions revealed by the opinion poll, or if this is not possible, have a 
frank and open discussion with the men and women involved and give 
them an explanation of why changes cannot be made, the effects are 
rarely other than desirable. The employees have, often for the first 
time, tangible evidence that management is sincerely interested in their 
welfare, and is honestly attempting to improve conditions. Where this 
is done, not only are many of the basic causes of dissatisfaction eliminated 
and the employees given an opportunity to relieve their tensions by 
expressing them on the questionnaire, but an atmosphere of mutual con- 
fidence is established which is of primary importance in building and 
maintaining good morale. 

Thus, where an employee opinion poll is properly conducted and full 
use is made of the findings, its effects cannot be otherwise than helpful, 
not only from the standpoint of the building and maintenance of employee 
morale, but in such by-products as improving levels of production and 
reducing absenteeism and turnover, since the latter conditions arise in 
large part from unrelieved employee dissatisfactions. 


Received March 20, 1946. 








Signed Versus Unsigned Personal Questionnaires 


Robert P. Fischer 
University of Illinois 


Despite considerable literature on questionnaires, personality inventories, 
attitude scales, etc., there is a paucity of empirical data on the 
influence of signatures upon the results obtained with these devices. 
Many writers have recommended keeping questionnaires, etc., anonymous, 
but presumably have done so on the basis of personal conviction 
rather than on a basis of any factual observations. Usually it has been 
asserted that where the information sought is of a personal nature, or, as 
in the case of an attitude study, where the respondent’s views are likely 
to be in disagreement with those held by the examiner, anonymity is 
essential in order to obtain honest information. Such claims are, how- 
ever, for the most part unsubstantiated. 

The present writer is aware of only two studies in which the influence 
of signatures upon questionnaire data has been investigated. Olson¹ 
studied the influence of waiver of signature on personal reports. Using 
the Woodworth-Mathews Personal Data Sheet, which is designed to 
measure emotional stability, Olson examined 100 upperclass women who 
were preparing to be teachers. One group of 60 women was given the 
data sheets with instructions not to sign them. Immediately after finishing 
the task they were instructed to again fill out the data sheet but this 
time to sign their names. These conditions were reversed for another 
group of 40 students. On the basis of the results he obtained, he concluded 
that, “There is thus a high probability that more symptoms will 
be reported in an initial application of the instrument when names are 
omitted.” He further concluded that, “The initial experience, however, 
appears to establish a set or memory factor which prevents large changes 
on the second application to the same group.” 

Corey,² in a study of the influence of signatures on attitude questionnaires, 
however, did not get the same results. He used a questionnaire 
designed to measure attitude toward cheating on examinations. 
By use of concealed pin pricks he was able to identify the anonymous 

¹ W. C. Olson. The waiver of signatures in personal data reports. J. appl. Psychol., 
1936, 20, 442-450. 

² S. M. Corey. Signed vs. unsigned attitude questionnaires. J. educ. Psychol., 
1937, 28, 144-148. 
papers of the 150 college students used in his study. He had the students 
first respond to the questionnaire but not sign their names. After the 
papers were collected he had the group respond in the same way except 
that he had them sign their papers. He discovered no statistically 
significant differences between the mean scores obtained under the two 
conditions of administration. The reliability of the questionnaire was 
similarly high under both conditions, being .93 for the unsigned questionnaires 
and .90 for the signed ones. The coefficient of correlation between 
the scores made under the two conditions was .85. From these 
results he concluded that, “... the concern of investigators over the 
invalidating effects of a signature may have been exaggerated.” 
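
A coefficient of correlation of the kind Corey reports can be computed from paired scores as in the following Python sketch. The function name `pearson_r` and the two short score lists are hypothetical; Corey's data are not reproduced here, and the product-moment formula is assumed rather than taken from his paper.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and of squared deviations about the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    squares_x = sum((a - mean_x) ** 2 for a in x)
    squares_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(squares_x * squares_y)


# Hypothetical scores for six students under the two conditions.
unsigned_scores = [12, 15, 9, 20, 17, 11]
signed_scores = [11, 16, 10, 18, 17, 13]
r = pearson_r(unsigned_scores, signed_scores)
print(round(r, 2))
```

A value near 1.00 would mean that students kept nearly the same rank order under both conditions, which is the sense in which a coefficient of .85 is read above.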

Several factors may have been operating to make the results of these 
two studies dissimilar. In the first place, there may have been some 
differences in the subjects used. There may have been differences in the 
set of the two groups. Olson’s subjects, it will be recalled, were upperclass 
women preparing to be teachers. It is possible they may have 
tried to anticipate the nature of the experiment and thus reacted differ- 
ently from Corey’s subjects. Then too, there was a difference in the 
tests used. Whatever may have caused the variation in results in the 
two studies reported above, the problem of the influence of signatures on 
questionnaire type data still needs some investigation. The present 
study was carried out to throw further light on the problem. The College 
Form of Mooney’s Problem Check List was administered under two sets 
of conditions, first with signatures and then without, to obtain the data 
for this study. 

The College Form of Mooney’s Problem Check List³ is designed to 
aid students in the expression of their personal problems. It consists of 
330 items and 5 questions. Each item is intended to suggest a possible 
personal problem, while the 5 questions are included for summary pur- 
poses. The 330 items are classified under eleven general headings which 
are shown in Table 1. The student is instructed to read through the 
list of items, underline those which are of concern to him and then to go 
back over the list of underlined items and circle those which are of most 
concern to him. 

There is a place provided on the check list for the student’s name but 
whether he signs the list depends upon the use that is to be made of it. 
When some counseling follow-up is intended the student is instructed to 
sign the list. If the data are to be used in group form, for example in 
surveying the problems of a specific class, signing is presumably regarded 
as inadvisable. It was in order to determine the influence of signing the 


³ R. L. Mooney. Problem check list. Columbus, Ohio: Bureau of Educational 
Research, Ohio State University, 1941. 
check lists on the number of items underlined and circled that the following 
study was carried out. 


Conditions of Present Study 


The Mooney check list was given to the students in two of the writer’s 
classes in psychology at the University of Illinois. It was administered 
at the beginning of the eighth week of the fall semester, 1944-45. 
The responses of 56 sophomore women in a class in general psychology 
and 46 upperclass women (mostly juniors and seniors) in a class in 
industrial psychology were used in this research. The students in the 
class in general psychology were well acquainted with each other having 
been in several other classes together. While this was not true of the 
women in the class in industrial psychology, every effort was made to see 
that these students became acquainted with each other during class. 
The degree of rapport with both classes was high. The writer had had 
conferences with most of the students and had tried in every way to 
establish friendly relations with them. Compared with previous experi- 
ence with other classes the cooperativeness of these two groups of stud- 
ents was outstanding. 

In both classes the students were, at the time of participating in this 
research, studying topics which made the use of the check list appear quite in order. 
Both classes were studying personality and its measurement in the general 
context of personal adjustment. 


The check lists were given to the two groups of students on one Monday 
with the following instructions: “You have been studying personality and 
its measurement for over a week now. One of the best ways to really learn 
about the measurement of personal adjustment is to take some of the tests 
we have been discussing yourselves. Therefore, today I want to give you the 
College Form of Mooney’s Problem Check List (this check list had not been 
discussed previously). Now this is not a test in the usual psychological sense 
but consists of a number of things which might be problems to you. I want 
you to read the instructions on the check list and then proceed to fill it out 
as honestly and [word illegible] as you can. Your papers will be kept strictly confidential 
and later I will discuss them with you individually. You should get 
at least two benefits from doing this project. First, you should learn some- 
thing first hand of the nature of problem check lists and second, you should 
gain some insight into the things that are disturbing you. Please be as frank 
and honest as you can or the results won’t be of any value to you. Remember, 
no one else besides me will see your papers.” They were then told to proceed. 


On the following Monday the check list was again given to the same students.⁴ 
This time it was given with the following instructions: “You will 
remember that a week ago you each filled out a problem check list. I have 
gone over them and am ready to discuss them with you. In scoring them 


⁴ The students who were not present at the first administration were dismissed 
before class began. The interim of one week should avoid the memory factor pointed 
out by Olson and probably also present in Corey’s results. 
I noted some interesting trends and would like to make some group summaries. 
Of course I will not turn your signed check lists over to an assistant 
to work on but I do want them summarized. I wonder if you will fill one out 
today but this time do not sign it. This way my assistant can work on them 
and not know whose they are. Please be as frank and honest as you can and 
do not sign the check lists or mark them in any other way that may identify 
them.” The students were then told to proceed. The writer then left the 
room in charge of an assistant to allay any suspicion that the papers might 
be identified. 


During the interim between testings no mention was made of the 
results of the first administration of the check list and no other check list 
or personality tests were given. In fact, the classes were, for all intents 
and purposes, through with the topic of personality. 


Results 


A preliminary analysis of the results made by the two classes, both 
when the check lists were signed and when they were not, showed that 
they did not differ appreciably. Significance tests were applied to the 
small differences obtained between the two classes but in no case were 
the differences statistically significant. Accordingly the papers of the 
students in the class in general psychology were combined with those of 
the students in the class in industrial psychology into a single group of 
papers. There were 102 such papers on the first testing (with signature) 
and 94 on the second testing (without signature). This difference of 8 
was due to an absence of 8 students at the time of the second testing who 
were present at the time of the first testing. There is no reason to sus- 
pect that these 8 cases would have altered the results any. Unfortun- 
ately, since attendance was not taken (to further allay suspicion) it was 
not possible to identify the 8 missing papers and thus the papers of these 
absentees were not removed from those obtained at the first testing. 

The mean number of problems underlined when the check lists were 
signed was 34.37 while the corresponding mean when the check lists were 
unsigned was 36.00. This difference of 1.63 had a critical ratio of 0.54. 
Obviously this was not a statistically significant difference. The mean 
number of problems circled (serious problems) on the signed papers was 
8.11 and 11.32 on the unsigned papers. This difference of 3.21 had a 
critical ratio of 2.38. While a critical ratio of this magnitude is not the 
conventional 3.00 often demanded for statistical significance, it is great 
enough to indicate a difference in directionality of the above two means 
which is probably due to something other than chance. It shows that 
there tended to be significantly more serious problems on the unsigned 
check lists than on the signed ones. 
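
Each critical ratio reported above is a difference between two means divided by the standard error of that difference. A minimal sketch in Python follows, assuming the usual formula for two independent groups. The paper does not state which formula was used, and the standard deviations are not given in this section, so the function names and the independence assumption are illustrative only. The second helper simply backs out the standard error implied by each reported difference and ratio.

```python
import math


def critical_ratio(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Difference between two means divided by the standard error of the difference.

    Assumes independent groups, so that the standard error is
    sqrt(sd_1**2 / n_1 + sd_2**2 / n_2). This is an assumption, not the
    paper's stated method.
    """
    standard_error = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_2 - mean_1) / standard_error


def implied_standard_error(difference, ratio):
    """Standard error implied by a reported difference and its critical ratio."""
    return difference / ratio


# Figures reported in the text: underlined problems, then circled problems.
se_underlined = implied_standard_error(1.63, 0.54)
se_circled = implied_standard_error(3.21, 2.38)
print(round(se_underlined, 2), round(se_circled, 2))
```

On the reported figures the implied standard errors come to about 3.02 problems for the underlined items and about 1.35 for the circled items.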

Table 1 shows the mean number of problems underlined (total prob- 
lems) in each of the eleven areas on both the signed and unsigned check 
                                    Table 1 
          Problems Underlined by Group With and Without Signatures 

                                                     Mean      Mean              Critical
Problem Areas                                       Names  No Names  Difference     Ratio

Health and Physical Development                     2.863     2.734       -.129     -.374
Finances, Living Conditions, and Employment         1.294     1.479        .185      .703
Social and Recreational Activity                    3.324     3.969        .645     1.352
Social-Psychological Relations                      3.481     3.192       -.239     -.508
Personal-Psychological Relations                    4.794     4.767       -.027     -.048
Courtship, Sex, and Marriage                        3.147     3.490        .343      .940
Home and Family                                     2.186     1.873       -.313     -.801
Morals and Religion                           [illegible]     2.745        .206      .544
Adjustment to College Work                          3.804     4.182        .378      .696
The Future: Vocational and Educational              5.029     4.980       -.049     -.092
Curriculum and Teaching Procedure                   1.961     2.586        .625     1.586
Total                                              34.372    35.995       1.623      .543





1 Results obtained from the original data, not from the figures shown above. 


lists. It further shows the differences in the above means along with the 
critical ratios of these differences. Table 2 shows comparable data for 
the circled problems. 

From Table 1 it may be seen that in none of the eleven areas were 
there any significant differences in the mean number of problems underlined
by the group with as compared with the group without signatures. In general it
may be assumed that whether or not the group signed the check list was 
of no importance in determining the average number of problems under- 
lined in the eleven areas. It will be noted that the greatest differences 
were in the areas headed: “Curriculum and Teaching Procedures,” and
“Social and Recreational Activities,” but these differences were not
statistically significant. 

From Table 2 it may be seen that there were fairly significant differ- 
ences in the number of problems circled (serious problems) in three of the 
areas by the group with as compared with the group without signature. 
There was a tendency for a higher average number of serious problems in 
the areas headed, “Curriculum and Teaching Procedures,” “Finances, 
Living Conditions, and Employment,” and “Social and Recreational 
Activities,” when the students did not sign their check lists than when 
they signed them. 

In general, therefore, it may be assumed that the students in this 
study tended to circle more problems when their names were withheld 
from the check lists than when their names were used, but that there was 
no significant difference in the number of problems underlined under the 










Table 2
Problems Circled by Group With and Without Signatures

                                                 Mean     Mean
                                                          No                   Critical
Problem Areas                                    Names    Names   Difference   Ratio

Health and Physical Development                   .569     ...       .144       1.021
Finances, Living Conditions, and Employment       .225     ...       .254       2.153
Social and Recreational Activity                  ...      ...       .354       1.924
Social-Psychological Relations                    ...      ...       .178       1.066
Personal-Psychological Relations                  ...      ...       .251        .893
Courtship, Sex, and Marriage                      .990     ...       .435       1.851
Home and Family                                   ...      ...       .047        .220
Morals and Religion                               .549     ...       .291       1.516
Adjustment to College Work                        ...      ...       .583       1.685
The Future: Vocational and Educational           1.549     ...       .291        .977
Curriculum and Teaching Procedure                 ...      ...       .383       2.424

Total ¹                                           ...    11.319     3.211       2.384

1 Results obtained from the original data, not from the figures shown above.


two conditions. In view of the relatively small sample used and the 
special conditions of rapport, it is not possible to assume that the differ- 
ences observed were necessarily due to the withholding of signatures nor 
would it be reasonable to generalize these results to other populations. 


Nonetheless, the fact that the use of signature on personal questionnaires 
may influence the honesty and frankness of students, as indicated by the 
number of personal problems checked, seems a distinct possibility. 
Further evidence for this possibility lies in the similarity of the results of 
this study with those of Olson noted earlier. 


Summary 


The College Form of Mooney’s Problem Check List was given to 102 
upperclass women students in psychology first with and then without 
signatures being used. The interim between testings was one week. 
The results indicated that the mean number of problems underlined 
(total problems, presumably not serious) did not vary significantly under 
the two conditions of administration but that the mean number of prob- 
lems circled (serious problems) tended to be significantly greater when 
signatures were withheld. In view of similar results reported by Olson 
it would appear that the use of signatures on personal questionnaires 
(particularly in the case of highly personal items or serious problems) 
might have a relative inhibitory effect on the honesty and frankness of 
the people responding to them. 


Received July 11, 1945. 





Age of College Graduation and Success in Adult Life * 


S. L. Pressey 
Ohio State University 


In discussions of educational acceleration it is often argued that al- 
though young students may make admirable college records, early gradu- 
ation starts them into their adult careers while yet so immature, or they 
are so often a “bright boy” type of personality, that early promise is not
fulfilled. Instead, the graduate at average or older age is supposed to 
have a maturity, and a greater variety of experience as in work or travel, 
which results in a more substantial adult career. Now there is the special 
problem as to whether the veteran who returns to school, and finally gets 
started in his civilian career older than usually occurred before the war, 
will gain or lose because of his greater age; if the latter, accelerated pro- 
grams for veterans are suggested, and as a matter of fact are desired by 
many veterans. The data here reported bear on these various related 
matters and seem especially timely now. 


Material and Methods 


The issue was straight-forward and specific: What is the success in 
life-career of students who graduate from college young, at an average age, 
or older? For study of the problem, alumni records of as great complete- 
ness as possible seemed needed. In most institutions such records are 
notoriously inadequate and especially so with reference to the careers of 
the less successful graduates. However, Amherst College has notably 
complete published alumni records from the first graduates to 1938, 
including for almost all former students their vocational careers, status 
in community or profession as shown by honors or other recognition, 
memberships in social or community or professional groups, and family. 
Sources other than the individuals concerned seem often to have been 
used, and exceptional adequacy and accuracy obtained. The volume 
thus seems a mine of information for studies regarding certain problems 
in higher education. 

The purpose of the investigation here reported was to appraise total 
adult careers. Only classes the careers of whose members had been 
largely completed at the time when the data in this volume were gathered, 


* This is the 25th in a series of reports regarding research in the Bureau of Educa- 
tional Research of the Ohio State University, regarding educational acceleration. 
could therefore be used. But cases were desired as close to the present
as the above requirement permitted. Various considerations led to use 
of the classes of 1880 through 1900. Clearly an individual who died 
young did not have a chance to show what he could do. Only graduates 
in these classes who had lived to be 50 years or over were accordingly 
included and only those born in this country. The question is as to 
whether or not age of college graduation shows any relationship to the 
adult careers of these individuals. 


Results 


Table 1 shows for graduates at each age the per cent not marrying, 
the average age of first marriage for those who did, and the average 
number of children for both groups together. Youngest graduates are 
slightly more likely to marry and do so youngest; older graduates marry 
later but hardly less frequently. Average number of children runs 
practically the same throughout. 


Table 1

Number Marrying, Age of Marriage, and Number of Children (entire group) in Relation
to Age of College Graduation, 1411 American-born Graduates of Amherst
in the Classes of 1880 through 1900 who lived to be 50 or over

Age of Graduation        19     20     21     22     23     24     25     26   27 Up

Number of Graduates      24    114    344    416    228    104     82     37     62
% Not Marrying            8      6     10     10      8     21     14     11     10
Av. Age of Marriage    26.8   29.9   29.8   30.6   30.5   31.1   31.5   31.0   34.1
Av. No. of Children     2.0    1.6    1.8    1.8    1.8    1.8    1.9    2.0    2.0





Average age of marriage of Harvard graduates of the classes of 1891- 
1900 was 31.0 and 25% of the marriages were childless (8). For the 
Amherst group here dealt with average age of marriage was 30.5 and 24% 
were without children. A more recent study (1) shows 9.2% of male 
college graduates over 40 in this country never to have married. In 
short, the Amherst data in total appear to be reasonably typical (in
these and yet other ways which need not here be gone into). It seems a
fair inference that the striking findings next to be presented are of some 
general significance. 

The material of major importance must now be considered, the find- 
ings as to adult career. Careers were appraised on a scale of seven. 
Individuals who were internationally known were rated “7”; “6” indicated
national prominence and “5” prominence locally; “4” was for
average success for a college graduate and “3” for only a mediocre career;
“2” indicated relatively unskilled work, “1” a failure (not self-supporting),
and “0” a criminal or shady record. The first two categories were
found relatively easy to assign, such criteria being used as inclusion in
“Who’s Who.” Local prominence was considered to be indicated by
membership in local or regional organizations or similar evidence. At 
the other extreme, no case was found with a criminal or shady record or 
practically throughout not self-supporting; such cases either did not occur 
or were mercifully not so reported. However, an appreciable number 
were classified as “2”; thus one man after rather obviously failing in 
newspaper work had spent most of his life in relatively unskilled factory 
work, while another had gradually dropped back and spent the last 25 
years as a letter-carrier. 

Procedure in rating was simple. The statement about each indivi- 
dual was read and rated by at least two assistants, neither for the most 
part knowing the basic purpose of this study or paying attention to age 


Table 2

Adult Careers of 924 American-born Amherst Graduates Who Lived to be 50 or Over:
All Those Graduating Under 21 and Over 25 Between 1880 and 1900
and the Entire Graduating Classes in 13 of These Years

Age of Graduation        19     20     21     22     23     24     25     26   27 Up

Number of Graduates      24    114    216    235    132     58     47     37     60
After College Success
  % Nationally known     29     22     15     12     10    ...    ...    ...    ...
  % Failures              4      6      6      5      2    ...    ...    ...    ...




of college graduation. If the same rating was assigned by both ap- 
praisers, it stood; if the ratings differed by but one, the lower of the two 
ratings was used as probably nearer right in view of the tendency of such 
statements to be over-favorable. If the raters differed more, final de- 
cision was made by a third person in close touch with the study. Such 
ratings were made for all cases graduating under 21 or over 25, in the 21 
graduating classes from 1880 through 1900. To save time, no ratings 
were made on those graduating at the most common ages of 21-24 in the 
classes of ’83, ’84, ’88, ’89, ’93, ’94, ’97, and ’99. The cases at these ages
from the other 13 classes seemed sufficient. On the whole, it is believed 
that these ratings appraised the life careers of these individuals with 
reasonable accuracy, especially at the extremes shown in Table 2. 
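
The rule for combining the two appraisers' ratings can be stated compactly. The sketch below is only an illustration of the procedure as described above; the function name and the `arbiter` argument are hypothetical, and the third person's decision is simply passed in as a number.

```python
def reconcile(first, second, arbiter=None):
    """Combine two independent career ratings on the 0-7 scale.

    Same rating: it stands. Ratings one step apart: the lower is used.
    Wider disagreement: the decision of a third person (arbiter) is used.
    """
    if first == second:
        return first
    if abs(first - second) == 1:
        return min(first, second)
    if arbiter is None:
        raise ValueError("ratings differ by more than one; a third rating is required")
    return arbiter

print(reconcile(5, 5))             # agreement
print(reconcile(6, 5))             # one step apart, lower rating used
print(reconcile(6, 3, arbiter=4))  # wider gap, third person decides
```

Taking the lower of two adjacent ratings builds a deliberate downward correction into the scale, in keeping with the stated tendency of alumni statements to be over-favorable.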

The relationship of age of graduation to later success is surely strik- 
ing. A quarter of the small group graduating at 19 was nationally 
known. And the per cents thus known drop steadily until no such cases 
are found for those graduating over age 26. Moreover, failures are not
more common among the youngest graduates; they are not more often
maladjusted or unstable nor do they often, like William James Sidis,
eventually become dismal failures. The number of failures is slightly
less at 23 and 24; one might hypothesize that these were the regular
ages when the conventional best adjusted students came through. But
the differences are so slight as to be of doubtful significance, and it would 
seem best to infer simply that among graduates from 19 through 25 the 
per cent of failures is about the same. After that, failures become more 
common. 

May the above findings result simply from the selective processes 
going on throughout the educational system, bringing it about that the 
brightest students get into and through college youngest, whereas medi- 
ocre or dull individuals enter college a year or two late because of poor 
work in elementary or secondary school, or in college take longer than the 
usual four years? The last possibility seems for the most part not a 
factor here, because the Amherst alumni records seem to list a man as in 
the class with which he entered unless he specifically requests otherwise. 
Thus a man who entered in 1880, but took five years because of poor 
scholarship, would be listed with the class of ’84 and not ’85. But late 
entrance due to poor ability might well operate. 

Forty years and more ago, colleges did not give intelligence tests. 
But a little help can be got, for judging possible relationships of ability 


Table 3

Age of Graduation and General Ability as Tested at Entrance, 1096 Graduates from
5 Undergraduate Colleges of Ohio State University, School Year 1941-42

              Under
Percentile     21      21      22      23      24    25 Up    Total

90-100         24     111      97      29      20      30      311
80-89           5      56      58      29      12      20      180
70-79           5      48      51      21      11      21      157
60-69           1      39      48      23      11      13      135
50-59           2      29      26      22      10       7       96
40-49           1      18      31      11       9       9       79
30-39           2      21      21      17       4       8       73
20-29           1       9      13       7       2       7       39
10-19           1       5       7       1       3               17
0-09                    2       3       1       2       1        9

Md.            91      80      76      69      70      76
No.            42     338     355     161      84     116     1096
% O.S.U.        4      31      32      14       8      11    Md. 22.5
% Amherst      11      27      26      15       8      13    Md. 22.4
to age of graduation, by seeing what they are now, and considering
whether the situation at Amherst might not have been similar. Table 3 
attempts this comparison. It shows general ability at entrance, as 
measured by the Ohio College Association Test of General Academic 
ability, for 1096 graduates in the five under-graduate colleges of Ohio 
State University in the school year 1941-42. 

As the medians show, those graduating at 20 or younger made somewhat
higher average scores on the scholastic ability tests at Ohio State
University than those graduating at 25 or over. The differences are not
very great, 15 percentiles between youngest and oldest groups. The
percentile form of statement, however, doubtless conceals the marked
superiority of some of the youngest students. But many of the older 
students also were of very good ability; over a quarter of those graduating 
25 and older tested in the upper tenth. The two bottom rows show the 
distribution of graduation ages at Amherst and Ohio State University 
to be quite similar except for a somewhat larger proportion of very young 
and also older graduates at Amherst. The inference then is that the 
more frequent successes among the young graduates and more frequent 
failures among the older are not due entirely to differences in ability 
between older and younger groups. 
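
The statement that over a quarter of the oldest graduates tested in the upper tenth can be checked directly from Table 3. The brief sketch below uses the 90-100 row and the "No." row of that table; the variable and function names are illustrative only.

```python
ages = ["Under 21", "21", "22", "23", "24", "25 Up"]
top_band = [24, 111, 97, 29, 20, 30]     # graduates in the 90-100 percentile row
group_n = [42, 338, 355, 161, 84, 116]   # "No." row: graduates at each age

def top_band_share(top, totals):
    """Per cent of each age group falling in the 90-100 percentile band."""
    return [100.0 * t / n for t, n in zip(top, totals)]

shares = top_band_share(top_band, group_n)
for age, share in zip(ages, shares):
    print(f"{age:>8}: {share:.1f}% in the upper tenth")
```

The youngest group shows well over half its members in the top band, while the oldest group still has slightly more than a quarter there, which is the contrast the medians alone conceal.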


Interpretation and Application 


How then are the findings regarding relation of success to age of 
college graduation to be construed? There undoubtedly are relations to 
ability, as discussed in the preceding paragraphs. But that factor seems 
not enough. Presumably, somewhat related socio-economic influences 
play a part. The youngest graduates are likely to be those from homes 
of means and of advantages both facilitating education and furthering 
the beginning of life career. The writer, however, believes that there 
are in this situation two other factors now largely neglected, and the 
second of great possible importance. 

In the first place, the older (not the younger) students are often in 
various ways maladjusted. Practically every investigation bearing on 
the matter gives support to this conclusion. Typical are certain of the 
writer’s findings based on the records of 5,977 freshmen entering the five
undergraduate colleges of Ohio State University, and 2,055 graduates. 
Only 19 per cent of those entering over 21 graduated in the regular four 
academic years as compared with 33 per cent of those entering at 16; 
only 18 per cent of graduates over 24 had academic records averaging 
“B” or better as compared with 33 per cent of graduates under 21; 34 
per cent of graduates over 24 did not participate in any way in
extracurricular activities as compared with only 10 per cent of those
graduating below 21, and almost three times as many of the younger as compared
with the older group had held office. The figures regarding activities 
seem especially important as indicative of difficulty in adjusting to 
campus life. All this may be due again primarily to less favorable socio- 
economic status of older students, or to interruptions in schooling. But 
the writer ventures the hypothesis that underlying developmental fact- 
ors, which have unfortunately been almost completely neglected by both 
psychologists and educators, may be yet more important. Surely physi- 
ological changes around the age of twenty, when the organism stops 
growing, are in total more profound than the special and comparatively 
more incidental changes of puberty, when it is well recognized that re- 
percussions in the personality are great. Much scattered evidence sug- 
gests that changes in personality around twenty are also marked. The 
individual becomes more serious and purposeful, desires to marry, to 
establish himself as independent of his family and of tutelage from the 
previous generation. The older student feels belated and out of place 
with youngsters still in later adolescence, and uncomfortable that he is 
not yet into recognized independent adulthood intellectually, socially, or 
economically. All this might contribute to the maladjustment of the 
older student while in college, and also handicap him as he moved into 
his career after graduation.¹ Surely such incongruities between developmental
level and student status are likely to be more acute with returning
veterans, who will be older in years and much more in experience than 
the average pre-war student. 

The second factor may well be much more important. It too can 
only briefly be touched upon here but is systematically discussed in 
reference 9. Recent research has emphasized that the prime of life in 
health and vigor, in intellectual creativeness, in energy and enthusiasm,
comes early in adult life. Outstanding discoveries in science, finest books
in literature, greatest inventions, all such accomplishments tend to be
made by relatively young men (2, 3, 5, 6, 9, 10). In short, it may be far
more important than is ordinarily realized, that a man be started in his 
life career early in his prime, if his maximal potentialities are to be 
realized. And even a few years delay may well be serious. 

At least such are possibilities by way of explanation of the above data. 
And both the data and considerations such as have been mentioned above 
seem of especial importance now. The colleges are now dealing with 
large numbers of much older men returning from the services or war work. 
Furthermore, a year of military training may be required, again probably 
delaying college graduation and the beginning of adult careers. If the 


1 The issues of necessity only briefly touched upon in this and the following
paragraph are more adequately dealt with in reference 9 and given larger perspectives in 10.
above findings indicate a major relationship between age of completing
full-time education and adult success, issues are presented of the greatest 
importance, both for education and for national policy as regards utiliza- 
tion of the human resources of the ablest group of young people in the 
country. Acceleration, especially of veterans but also of able students 
generally, seems called for. Finally, peace-time conscription appears to 
entail problems ordinarily neglected. 


Summary 


1. The paper reports an effort to determine relations between age of 
college graduation and success in adult life. The unusually complete 
alumni records of Amherst College for graduates in the years 1880-1900 
inclusive, who were born in this country and who had lived to be at least 
50, were the data for the study. 

2. For these classes, it was found that the per cent marrying and 
number of children did not vary significantly with age of college gradu- 
ation. Age of marriage, however, increased with increased graduating 
age. 
3. Most significant was the relationship of graduating age to voca- 
tional success. Success was judged by ratings by two raters working 
independently with arbitration by a third in case of disagreement; the 
ratings seemed of considerable reliability. A steady decrease in per cent 
highly successful, nationally or internationally known, was found from 
youngest graduates to oldest. Those graduating at an older age were 
more likely to have been failures. 

4. Two special hypotheses are offered in explanation: (a) Older 
students were maladjusted to college work and college life, with conse- 
quent handicap in adulthood. (b) Late graduation too much reduced 
the number of most vigorous years in the prime of life which might have 
gone into most energetic initiation of life career. It is believed that such 
findings argue for a judicious acceleration of educational programs, 
especially for veterans, and argue against peacetime conscription. 


Received April 21, 1945. 
Predicting Success in a School of Nursing *


A. Q. Sartain 
Southern Methodist University 


It is important to a school of nursing to be able to predict, with as 
great accuracy as possible, the success that an applicant for admission 
will have in the school, and to be able to choose as nearly as possible only 
those likely to succeed. Not only is elimination of those likely to fail a 
service to the weak applicants, but it is usually decidedly to the advantage 
of the school of nursing, since, as Potts ¹ has pointed out, students are
usually liabilities rather than assets to a hospital until some months have 
passed. 


Statement of the Problem 


The purpose of this study was to determine the extent to which suc- 
cess in nursing school could be determined from the high school averages 
of students and also from their scores on a battery of tests administered 
by the Nurse Testing Division of the Psychological Corporation (the 
Revised Alpha Examination, Form 8; the Columbia Vocabulary Test; 
the MacQuarrie Test for Mechanical Ability; the Bernreuter Personality 
Inventory; and the Potts-Bennett Tests for Nursing Aptitude—referred 
to in this study as though the two sections comprised a single test). 


History of the Problem 


A number of studies have concerned themselves with the use of psy- 
chological tests in the selection of nursing school candidates, and at least 
two agencies, the Psychological Corporation and the National League of 
Nursing Education, administer their test batteries to many applicants 
each year. The battery used by the Corporation has just been described, 


* The writer wishes to express his appreciation to Miss Merle Mayo, R.N., of the 
Parkland Hospital School of Nursing, who furnished the grade average and the high 
school average for each girl and whose cooperation in obtaining the test scores made 
this study possible; to Mrs. Margaret Scruggs-Carruth, who did a great deal of the 
statistical and other work; and to Miss Edith M. Potts, R.N., of the Nurse Testing 
Division of the Psychological Corporation, who encouraged and cooperated with the 
study. 

1 Potts, Edith Margaret. Use of tests in selecting student nurses advantageous to 
hospital and student. Hospital Management, 1941, 52, 39-42. 



and the League makes extensive use of tests of the Cooperative Test 
Service and also employs the A. C. E. Psychological Examination. 

One of the most extensive investigations in this field was the work of
Williamson, Stover, and Fiss.² Although handicapped by a lack of
records on students who did not complete a full year of work, and by
varying standards of grading, they found that the Moss Nursing Aptitude
Test, the Cooperative English Test, the Cooperative General Science
Test, and the Cooperative Vocabulary Test correlated best with nursing
school success, with multiple regression coefficients reaching .54 in some
cases. A similar investigation by MacPhail and Bernard ³ yielded
coefficients of correlation between intelligence test scores and preliminary
grades of from .42 to .60, though differences between those graduating
and those failing to do so were not significant in two of the four schools
studied. Douglass and Merrill ⁴ secured a multiple regression coefficient
of .77 when success in a school of nursing was predicted from scores on
the Moss Nursing Aptitude Test, the Cooperative General Science Test
(Part I), the Douglass-Gordon Fraction Test, and the high school
percentile rank. The Moss Nursing Aptitude Test and the high school
percentile rank yielded a coefficient of .75.

Rainier, Rehfeld, and Madigan ⁵ obtained correlations of more than
.40 between nursing school grades and the Iowa Reading Comprehension
Test, while Garrison ⁶ obtained correlations of .48 between academic
grades and the Otis Self-Administering Test of Mental Ability and .59
between nursing arts grades and the Otis. Bennett and Gordon ⁷ made
a careful analysis of scores on the Bernreuter Personality Inventory and
concluded that the test was of little or no value in predicting success in
nursing school. Finally, Scruggs-Carruth ⁸ made a preliminary study of
the subjects employed in the present group.


2 Williamson, E. G., Stover, R. D., and Fiss, C. B. The selection of student nurses.
J. appl. Psychol., 1938, 22, 119-131.

3 MacPhail, A. H., and Bernard, W. Ten years of intelligence testing. Educ. &
Psychol. Meas., 1943, 3, 157-165.

4 Douglass, H. R., and Merrill, R. A. Prediction of success in the school of nursing.
Univ. Minn. Stud. in the Prediction of Scholastic Achievement, 1942, 2, 17-31.

5 Rainier, R. N., Rehfeld, F. W., and Madigan, M. E. The use of tests in guiding
student nurses. Amer. J. Nursing, 1942, 42, 674-682.

6 Garrison, K. C. The use of psychological tests in the selection of student nurses.
J. appl. Psychol., 1939, 23, 461-472.

7 Bennett, Geo. K., and Gordon, H. Phoebe. Personality test scores and success in
the field of nursing. J. appl. Psychol., 1944, 28, 267-278.

8 Scruggs-Carruth, Margaret. The predictive value of nursing school tests.
Unpublished Thesis: Southern Methodist University, Dallas, Texas. 1944.
Subjects and Conditions of the Study


Eighty-one girls from the Parkland Hospital School of Nursing in 
Dallas, Texas, comprised the experimental group. These girls took the 
Psychological Corporation tests in 1942, and were admitted to the School 
of Nursing regardless of their scores on any of the tests. The criterion of 
success was the average grade earned by the student in all courses by the 
end of six months of training, or at the time the girl left the School of 
Nursing, if she was not in school six months later. Sixty-nine of the 
girls were still in school at the end of six months, and twelve had left, in 
almost every case because of failing grades. The faculty members who 
examined the average grades thought them to constitute in the main 
accurate measures of actual success in the School. 


Results of the Study 


The only scores available on the Bernreuter Personality Inventory 
were percentile scores on Emotional Stability, Dominance, Extraversion, 
and Self-Sufficiency. Table 1 gives the results of correlating the scores 
on these traits with the grade average. Since these coefficients are 


Table 1
Correlation Between Grades and Percentile Scores on the Bernreuter
Personality Inventory

Trait                                             r
...                                              .29
...                                              .26
...                                              .19
...                                              .17


based on percentile ranks and since they are low, as would be expected 
from other studies, they are not used further in this study. It is evident 
that they are not high enough to be of practical value, and there is little 
reason to believe that the use of the actual scores instead of percentile 
ranks would materially alter this situation. 

One of the difficulties encountered was the determination of comparable
high school averages for each girl,⁹ since grades might be in terms of
letters or numbers or even descriptive adjectives, inasmuch as the girls
came from widely scattered high schools. The method adopted for this
study was to give an “A” (or other highest category) a value of 95 (with
an “A plus” valued at 98 and an “A minus” valued at 92), a “B” a value
of 85, and so on, and thus to convert all grades to a numerical basis.
Needless to say, this introduced errors into the calculations, but obviously 


9 No high school record was available for four girls. Consequently, correlations
involving high school average were based on 77 cases only.
these differences in grading systems constitute one of the serious limitations
of the high school average as a means of prediction. This can be
overcome, however, when, as in Minnesota, Ohio, and other states, the 
high schools can be persuaded to report high school averages in terms of 
relative rank. 
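
The conversion of letter grades to a numerical basis described above might be sketched as follows. Only the values for "A," "A plus," "A minus," and "B" are given in the text; the values for lower letters and the plus and minus adjustments applied to letters other than "A" are assumed extensions of "and so on," and the function names are hypothetical.

```python
# "A" = 95 and "B" = 85 come from the text; "C", "D" and "F" are assumed
# by continuing the same ten-point step.
BASE_VALUE = {"A": 95, "B": 85, "C": 75, "D": 65, "F": 55}

# +3 and -3 are stated for "A plus" and "A minus"; applying the same
# offsets to other letters is an assumption.
OFFSET = {"": 0, "+": 3, "-": -3}

def grade_to_number(grade):
    """Convert a letter grade such as 'B+' to its numerical equivalent."""
    grade = grade.strip().upper()
    letter, modifier = grade[:1], grade[1:]
    return BASE_VALUE[letter] + OFFSET[modifier]

def high_school_average(grades):
    """Average the numerical equivalents of a list of letter grades."""
    values = [grade_to_number(g) for g in grades]
    return sum(values) / len(values)

print(high_school_average(["A", "B+", "A-", "C"]))
```

Any such mapping treats an "A" from one school as equal to an "A" from another, which is the source of error the text acknowledges.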

Table 2 gives the product-moment coefficients of correlation found by 
intercorrelating the variables. It should be noted that the test scores 
are total scores, no account being taken of the fact that some of the tests 
yield scores on two or more subtests. 


Table 2

Coefficients of Correlation Between Variables, and Mean and Standard
Deviation of Each Variable

                    Potts-    Army    Col.    H.S.    Mac-      S. of N.
Variable            Bennett   Alpha   Voc.    Ave.    Quarrie   Ave.       Mean    S.D.

Potts-Bennett          -      .776    .789    .429    .486      .677      165.1   42.40
Army Alpha             -       -      .776    .387    .393      .559      122.6   22.89
Col. Vocab.            -       -       -      .392    .365      .517       73.3   11.73
H.S. Average           -       -       -       -      .199      .460       84.3    5.52
MacQuarrie             -       -       -       -       -        .356      171.1   29.70
Sch. of Nurs. Av.      -       -       -       -       -         -         80.2    8.22





Table 2 reveals that the correlation between the criterion of success 
(school of nursing average) and scores on the Potts-Bennett test is quite 
high. The Alpha and Columbia Vocabulary tests also correlate with the 
criterion fairly highly, while the high school average and particularly the 
MacQuarrie do less well. Clearly, scores on the Potts-Bennett alone aid 
considerably in predicting success (improvement over chance 26.4%,
S.D.est. 6.05). And of course, as Tiffin ¹⁰ suggests, if one can maintain a
low selection ratio, the test has even more utility than these figures indi- 
cate. 
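
The two figures quoted in parentheses can be recovered from the correlation alone. The sketch below assumes that "improvement over chance" is the index of forecasting efficiency, 1 - sqrt(1 - r^2), and that the accompanying S.D. is the standard error of estimate, S.D. x sqrt(1 - r^2), with the criterion S.D. of 8.22 taken from Table 2. The paper names neither formula, so both are assumptions, and the function names are illustrative only.

```python
import math

def forecasting_efficiency(r):
    """Percentage improvement over chance: 100 * (1 - sqrt(1 - r^2))."""
    return 100.0 * (1.0 - math.sqrt(1.0 - r * r))

def standard_error_of_estimate(sd_criterion, r):
    """Spread of the criterion left after prediction: sd * sqrt(1 - r^2)."""
    return sd_criterion * math.sqrt(1.0 - r * r)

SD_NURSING_AVERAGE = 8.22  # S.D. of school of nursing average, from Table 2

# .677 is the Potts-Bennett correlation; .707 is the multiple coefficient
# reported for all five predictors taken together.
for label, r in [("Potts-Bennett alone", 0.677), ("all five predictors", 0.707)]:
    improvement = forecasting_efficiency(r)
    see = standard_error_of_estimate(SD_NURSING_AVERAGE, r)
    print(f"{label}: r = {r:.3f}, improvement = {improvement:.1f}%, "
          f"S.D. est. = {see:.2f}")
```

Under these assumptions both published pairs of figures (26.4% with 6.05, and 29.3% with 5.81) are reproduced to the precision printed.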

An additional question concerns the possibility of improving predic- 
tion still further by the best combination of test scores and high school 
average. The multiple regression coefficient obtained from the values 
in Table 2 was .707 (improvement over chance 29.3%, S.D.est. 5.81), and
the formula for the best prediction of the criterion was (in terms of Beta 
coefficients) : 


Sch. of Nursing Av. = .584 P-B + .103 Alpha — .118 Col. Voc. + 
.209 H.S.Av. + .033 MacQ 


¹⁰ Tiffin, Joseph. Industrial psychology. New York: Prentice-Hall, Inc., 1942,
p. 33.
Expressed in terms of raw scores the formula was


Sch. of Nursing Av. = .113 P-B + .037 Alpha — .083 Col. Voc. + 
.311 H.S.Av. + .009 MacQ + 35.18.
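
The link between the Beta formula and the raw-score formula can be sketched as follows. The conversion used here, b = Beta x (S.D. of criterion / S.D. of predictor), with the constant chosen so that the equation passes through the means, is the standard textbook relation and is assumed rather than stated in the paper. Means, S.D.s and criterion correlations are transcribed from Table 2; the variable names are illustrative.

```python
import math

# (mean, S.D., correlation with criterion) for each predictor, from Table 2.
predictors = {
    "P-B":       (165.1, 42.40, 0.677),
    "Alpha":     (122.6, 22.89, 0.559),
    "Col. Voc.": (73.3, 11.73, 0.517),
    "H.S.Av.":   (84.3, 5.52, 0.460),
    "MacQ":      (171.1, 29.70, 0.356),
}
# Beta coefficients from the standardized formula in the text.
betas = {"P-B": 0.584, "Alpha": 0.103, "Col. Voc.": -0.118,
         "H.S.Av.": 0.209, "MacQ": 0.033}
CRITERION_MEAN, CRITERION_SD = 80.2, 8.22

# Multiple R: R^2 is the sum of Beta times the correlation with the criterion.
multiple_r = math.sqrt(sum(betas[k] * predictors[k][2] for k in betas))

# Raw-score weight: b = Beta * (S.D. of criterion / S.D. of predictor).
raw_weights = {k: betas[k] * CRITERION_SD / predictors[k][1] for k in betas}

# Constant term: the regression equation passes through the means.
constant = CRITERION_MEAN - sum(raw_weights[k] * predictors[k][0] for k in betas)

print(f"R = {multiple_r:.3f}")
for name, b in raw_weights.items():
    print(f"{name}: b = {b:+.3f}")
print(f"constant = {constant:.2f}")
```

With the rounded Betas as input, the five raw-score weights agree with the published ones to three decimals, and the constant falls within a few hundredths of the published 35.18; the small gap is presumably due to rounding in the printed coefficients.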


Another multiple regression coefficient that was determined involved 
the prediction of the school of nursing average from the tests alone (with- 
out high school average). Here the coefficient was negligibly higher than 
for the Potts-Bennett alone, .680. And when the criterion was predicted 
from the Potts-Bennett and the high school average, the regression co- 
efficient was .702, and the formula for prediction was (in terms of Beta 
coefficients) : 

Sch. of Nursing Av. = .588 P-B + .208 H.S.Av. 


In terms of raw scores the formula was 
Sch. of Nursing Av. = .114 P-B + .309 H.S.Av. + 35.28. 
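
For the two-predictor case the Beta coefficients follow directly from three correlations in Table 2. The sketch below uses the usual two-predictor formulas, which the paper does not spell out and which are therefore an assumption here; the function name is illustrative.

```python
import math

def two_predictor_betas(r1y, r2y, r12):
    """Standardized weights for two predictors from their three correlations."""
    denom = 1.0 - r12 ** 2
    beta1 = (r1y - r2y * r12) / denom
    beta2 = (r2y - r1y * r12) / denom
    return beta1, beta2

# From Table 2: P-B with criterion, H.S. average with criterion,
# and P-B with H.S. average.
r_pb, r_hs, r_pb_hs = 0.677, 0.460, 0.429

beta_pb, beta_hs = two_predictor_betas(r_pb, r_hs, r_pb_hs)
multiple_r = math.sqrt(beta_pb * r_pb + beta_hs * r_hs)

print(f"Beta P-B = {beta_pb:.3f}, Beta H.S.Av. = {beta_hs:.3f}, R = {multiple_r:.3f}")
```

The Betas round to the published .588 and .208. The multiple coefficient computed from the rounded correlations lands within about .001 of the reported .702.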


It is evident from these facts that the Potts-Bennett Tests alone do
a creditable job in predicting success in this particular school of nursing,
and indeed that neither the other test scores nor the high school average
(nor both together) make any substantial improvement in this prediction.


Table 3

Coefficients of Correlation Between School of Nursing Average and Subtests of the
Potts-Bennett and MacQuarrie Tests

Test                                     r
Potts-Bennett
  Science Information                  .691
  Paragraph Comprehension              .630
  [subtest name illegible]             .352
  [subtest name illegible]             [illegible]
  [subtest name illegible]             .443
MacQuarrie
  Speed and Coordination               .268
  Mechanical Insight                   .356


Although they did not figure in the multiple regression coefficients 
referred to above, correlations were obtained between the school of 
nursing average and the subtests of the Potts-Bennett and the Mac- 
Quarrie. Table 3 gives these coefficients. From this table it appears 
that the subtests are no better than the complete tests in their predictive 
value, and there is little reason here to believe that a multiple regression
coefficient here would alter this conclusion. And of course, the lowered
reliability likely to result from the use of any shorter test alone makes 
this course of action unwise at least until additional work is done on this 
point. 


Summary and Conclusions 


There were obtained, for approximately 80 girls entering a school of 
nursing, the high school average and scores on the five tests generally 
used by the Nurse Testing Division of the Psychological Corporation. 
These tests were the Revised Alpha Examination, Form 8; the Bernreuter 
Personality Inventory; the Columbia Vocabulary Test; the MacQuarrie 
Test for Mechanical Ability; and the Potts-Bennett Tests. The average 
grade earned in the school of nursing during the first six months (or up to 
the time of withdrawal if the girl did not complete six months of training) 
was used as the criterion of success, and these grades were correlated with 
the test scores and the high school average, and intercorrelations were 
determined for scores on the various tests (except Bernreuter) and the 
high school average. Correlations were also worked out for the subtests 
of the Potts-Bennett and the MacQuarrie. On the basis of the study the 
following conclusions seem to be justified: 

1. The Potts-Bennett Tests were fairly effective in predicting success 
in the school of nursing, the coefficient of correlation being .677. 

2. Addition of the other tests (not including the Bernreuter) to the 
Potts-Bennett improved the predictive value by a negligible amount 
(R = .680). 

3. Addition to the above of the high school average yielded some 
increase (R = .707) but probably not enough to justify the time and 
energy involved. The high school average used, however, was not a 
measure of relative scholastic ability. 

4. The Potts-Bennett Tests and the high school average yielded a 
multiple regression coefficient of .702. The Potts-Bennett Tests alone 
were thus almost as efficient as any combination studied. 

5. The subtests of the Potts-Bennett and the MacQuarrie in general 
correlated less highly with the criterion than did the total scores on each 
test. 

6. Although only percentile scores were available for the Bernreuter 
Personality Inventory, it appears to correlate with the criterion less well
than the other tests. 


Received May 23, 1945. 
Teachers College Students and the Minnesota
Multiphasic Personality Inventory 


Orpha Maust Lough 
State Teachers College, Fredonia, New York 


Between February 1944 and January 1945, 202 students at a New 
York State Teachers College took the Minnesota Multiphasic Personality 
Inventory in connection with the course in Child Development. The 
records of only the 185 unmarried women students, the majority of
whom were Freshmen when they took the Inventory, were used in this 
study as the number of men students was too small to be significant. Of 
these women students, 74 were enrolled in the General Curriculum and
the remainder were enrolled in the Music Curriculum. Those enrolled in 
the General Curriculum were preparing to be elementary school teachers; 
those in the Music Curriculum, public school music teachers. 

The purpose of this investigation was to determine: (1) if on the basis 
of the MMPI, there were any significant differences on any of the scales 
between those students enrolled in the Music Curriculum and those in the 
General Curriculum, or between these teachers college students and the 
general population as reported by the authors of the Inventory; (2) 
whether such an Inventory might be useful in the selection of students 
for admission to the teaching profession; (3) whether the Inventory 
indicates in these teachers college students the probabilities of developing 
those types of maladjustments which have been claimed to be predomin- 
ant in various studies of school teachers. 

The Minnesota Multiphasic Personality Inventory is a technique 
developed at the University of Minnesota and published in 1943. It is in- 
dividually administered and consists of five hundred fifty statements in 
simple language, each printed on a separate card, covering a wide range 
of subjects including physical condition, morale, vocational interests, and 
social attitudes. The subject sorts these statements into three categories:
“True,” “False,” and “Cannot Say.” The decisions are recorded on a
printed Record Sheet according to instructions given in the Manual, and
scored on twelve different scales: three validating scales, The Question
Score (?), The Lie Score (L), and The Validity Score (F); and nine diagnostic
categories: The Hypochondriasis Scale (Hs), The Depression Scale (D),
The Hysteria Scale (Hy), The Psychopathic Deviate Scale (Pd), The
Interest Scale (Mf), The Paranoia Scale (Pa), The Psychasthenia Scale
(Pt), The Schizophrenia Scale (Sc), and The Hypomania Scale (Ma).


Findings 
The ages of the subjects in this study ranged from sixteen years
to twenty-three years, with a mean age of 18.8 years. The mean age
of those in the General Curriculum was 19.0 and those in the Music
Curriculum, 18.6. These students were given the Revised Alpha Exam-
ination Form 5. The mean percentile rank on the Alpha for both groups
taken together was 94.0; for the General Curriculum 93.2 and for the
Music Curriculum 94.4. Neither of these differences between the means
is statistically significant (see Table 1).

Fig. 1. T-score profiles on the Minnesota Multiphasic Personality Inventory for
Music Students and General Students enrolled in a New York State Teachers College.
Note: solid line = 74 General Students; dotted line = 111 Music Students.

Figure 1 shows graphically the profiles for the music students and the 
general students based on the means of the T-scores for each scale. 

Both of the profiles approach a fairly straight line at the T-score mean 
level of 50 although they are a little higher generally than the one reported 
by Schmidt (15) for 98 normal men. The profiles tend to be fairly similar 
although some differences may be noted. The mean of the music
students on the psychasthenia scale is lower than that of the general
students. The lowest point on the profile of the music students is the
hypochondriasis scale and the highest point, the hypomania scale. The
means of the general students on the hysteria scale and depression scale
are lower than those of the music students. The lowest point in the
profile of the general students is the psychasthenia scale and the highest 
is the hypomania scale. Both groups are approximately one-half a 
standard deviation above the mean level on the hypomania scale. 
Schmidt (15) found a decrease on the hypomania scale for all profiles of 
both the normal and clinical groups of men. 
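
The statement that both groups stand about one-half a standard deviation above the mean on the hypomania scale follows from the T-score scale itself, which has a normative mean of 50 and a standard deviation of 10. A minimal sketch, using the group means from Table 1 and an illustrative function name:

```python
T_MEAN, T_SD = 50.0, 10.0  # T-score scale: mean 50, standard deviation 10

def t_to_z(t_score):
    """Distance from the normative mean in standard deviation units."""
    return (t_score - T_MEAN) / T_SD

# Group means on the hypomania scale, as given in Table 1.
for group, mean_t in [("General Curriculum", 54.6), ("Music Curriculum", 54.5)]:
    print(f"{group}: {t_to_z(mean_t):+.2f} S.D. from the normative mean")
```

On the same scale, the cutoff of 70 discussed later in the paper corresponds to two standard deviations above the normative mean.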


Table 1

Comparison of the Teachers College Students Enrolled in the Music Curriculum and
in the General Curriculum on the Separate Scales of the Minnesota
Multiphasic Personality Inventory

                  Gen. Curr. (N = 74)            Music Curr. (N = 111)          Critical Ratio    All Students
                                                                                Between Gen.      (N = 185)
                                                                                Curr. and
                  Range          Mean    S.D.    Range          Mean    S.D.    Music Curr.       Mean    S.D.
Age               22-5 to 17-3   19.0    1.33    23-1 to 16-8   18.6    1.22    21                18.8    1.26
Rev. Alpha Form 5 71-99          93.2    6.10    76-99          94.4    4.84    16                94.0    5.33
M.M.P.I.
  ?               50-66          51.3    3.10    50-66          51.3    3.16    .008              51.3    3.14
  L               50-66          51.7    3.24    50-66          51.7    3.86    .000              51.7    3.63
  F               50-73          52.7    5.02    50-70          52.6    4.92    .018              52.6    4.97
  Hs              37-67          45.7    7.91    37-79          48.1    8.02    .210              47.1    7.98
  D               28-73          47.1    12.06   32-71          49.3    8.53    .150              48.4    9.42
  Hy              36-80          52.5    9.03    33-77          52.9    8.72    .030              52.7    9.05
  Pd              35-91          50.4    12.87   35-82          49.3    9.85    .071              49.7    11.16
  Mf              30-74          50.9    9.25    26-74          49.9    9.25    .076              50.3    9.25
  Pa              33-79          51.7    8.46    33-82          52.5    8.84    .072              52.2    8.69
  Pt              34-90          49.9    8.90    36-75          47.9    9.86    .150              48.7    9.21
  Sc              39-83          50.7    9.32    37-74          51.8    9.23    .080              51.4    9.27
  Ma              37-72          54.6    8.31    30-84          54.5    11.22   .004              54.6    10.14





Table 1 presents the ranges, means, standard deviations for each 
group and for all the students, together with the difference of the means, 
the standard errors of the difference, and the critical ratios between the 
two groups. 

From the data given in Table 1, it is apparent that there is no signi- 
ficant difference on any of the scales between the students enrolled in the 
two curricula as none of the critical ratios is three or more. 
According to the manual of instruction accompanying the MMPI,
normal persons do not often score above 70; but if the environmental 
pressure is small, or if other personality factors are favorable, a person 
may score over 70 and yet escape need for special attention. Table 2 
shows the percentage of these students with T-scores over 70 on the 
separate scales of the Inventory. 


Table 2

Percentage of the Teachers College Students with T-scores above 70 on the Separate
Scales of the Minnesota Multiphasic Personality Inventory

              Per cent of 74     Per cent of 111    Per cent of 185
              Gen. Stud.         Music Stud.        Students
Scale         with T-scores      with T-scores      with T-scores
M.M.P.I.      above 70           above 70           above 70
?                  0                  0                  0
L                  0                  0                  0
F                 1.3                 0                  .5
Hs                 0                 1.8                1.1
D                 2.6                 .9                1.6
Hy                4.0                5.4                4.9
Pd                6.5                2.7                3.8
Mf                4.0                1.8                2.7
Pa                5.4                4.5                4.9
Pt                1.3                2.7                2.2
Sc                5.4                3.6                3.8
Ma                4.0               10.0                7.6





On the basis of the percentage scoring above 70 on the various scales 
of the Inventory as given in Table 2, it would appear that there may be 
some personality differences between the students in the two curricula. 
The extremely high scores among the general students were on the 
psychopathic-paranoia-schizophrenia scales while among the music 
students, the extremely high scores were on the hypomania-hysteria- 
paranoia scales. When the two groups were combined, the highest
percentage of T-scores over 70 was on the hypomania scale and, in
decreasing order of percentage, on the hysteria, paranoia, psychopathic
deviate and schizophrenia scales.

From these data it would appear that the teachers college students 
who took the MMPI were on the whole normal and stable. They show 
some slight disposition toward hypomania which is characterized by over- 
productivity in thought and action; ambition, vigor; activity and en- 
thusiasm, although somewhat depressed at times; inclination to under- 
take too many things at a time, to stir up projects and then lose interest 
in them; a disposition to disregard social conventions. There are ap-
parently no very significant differences between those enrolled in the
music and in the general curricula except among those with T-scores over 
70. 

It may be that the slight tendency toward hypomania is characteristic 
of the late adolescent or young adult. It may be that this tendency is 
characteristic of women since Landis and Page (10) report that the
incidence of manic-depressive psychoses among men is but four-fifths as 
high as among women, while the incidence rate of schizophrenia is 15% 
higher among men than among women. 

In a study of seven hundred maladjusted school teachers who were or 
had been in hospitals for the mentally ill, Mason (12) found that the 
most common diagnosis among this group was dementia praecox and the 
second highest was manic-depressive psychosis. Thirty-seven per cent 
of the group suffered from schizophrenia, 31% of the men and 40% of the
women. Those diagnosed as suffering from manic-depressive psychosis 
constituted 24% of the group, 19% among the males and 26% among the 
females. The figures of New York state as a whole confirmed these 
findings that the largest percentage of teachers committed to state 
hospitals suffered from dementia praecox (27%) and the next largest 
clinical group was manic-depressive (14%). As shown by the present 
data, this group of women students preparing to be teachers showed 
little or no indication of schizophrenic trend while there was some slight 
evidence of manic-depressive tendencies. 

In a study by Harmon and Weiner (2) on the use of the MMPI in 
vocational advisement of disabled veterans they state that “elevation on 
Psychopathic or Hypomania scales indicates the type of personality most 
likely to adjust in jobs of a relatively undisciplined nature, where indi- 
vidual initiative and aggressiveness are at a premium, and which afford 
a maximum of variety in work processes, locale, or associates.” In the
light of this statement it would appear that teaching may be a good
vocation for those with hypomanic trends inasmuch as teachers who use
modern methods in their work with children are continuously varying 
their work processes, need initiative and aggressiveness, and their work is 
fairly unregimented. On the other hand, a teacher who is even in the 
early stages of manic-depressive psychosis may have a great disorganiz- 
ing effect upon the children under her direction because of her instability, 
erratic behavior, and lack of wholesome, well-integrated personality. 


Summary 


It would appear from this study of 185 unmarried women students in 
a teachers college that, on the basis of the Minnesota Multiphasic Per- 
sonality Inventory: 
1. They are a relatively stable, normal group with a very slight
tendency toward hypomania. 

2. There are no significant differences between those preparing to 
teach in the elementary grades and those preparing to become public 
school music teachers. 

3. There may be some slight relationship between the hypomania 
trend found in these students and the large incidence of manic-depressive 
psychosis found among teachers who have been hospitalized. These 
students show no trends toward schizophrenia which was found to be the 
most common diagnosis among a group of seven hundred hospitalized 
teachers (12). As age appears to be a factor in the onset of psychoses 
(10), the fact that students now entering New York state teachers colleges 
are required to complete a four-year curriculum, may eliminate some of
those inclined toward schizophrenia but probably not those with a pre- 
disposition to manic-depressive psychosis. Such trends, however, might 
be among the many factors which contribute to the elimination of stud- 
ents between the Freshman and Senior years and between graduation 
and status as a classroom teacher. A follow-up study should be made in 
order to verify this conclusion. 

4. It may be, on the basis of this study together with that of Harmon 
and Weiner (2), that certain scales of the MMPI might be useful as one 
of the instruments in the selection of students for admission to the teach- 
ing profession. This research may provide normative data for guidance 
workers who are counseling incoming freshmen in other teachers 
colleges. However, a great deal more research on the personality char- 
acteristics of prospective teachers needs to be done before such a program 
is instituted. 

Received June 13, 1945. 
The Occupational Adjustment Characteristics of a Group
of Sexually Promiscuous and Venereally 
Infected Females* 


Robert D. Weitz 
Jersey City, New Jersey 


This study is the second of a series in which the writer is engaged in 
connection with a program of social and vocational rehabilitation of sexu-
ally promiscuous and venereally infected females treated at the Mid-
western Medical Center, an intensive treatment center of the United
States Public Health Service.

It was pointed out in the first study (9), based on 500 cases, that the 
patients treated were generally below normal intelligence, the majority 
falling in the defective and borderline defective mental range. It was 
further pointed out that the group as a whole was more than a full year
retarded scholastically, the eighth grade being the median level completed.

This study is concerned with the occupational adjustment which 
generally characterized the group. As a basis of comparison, the data 
obtained were compared with those of a corresponding age group of un- 
selected female job applicants who applied for work through the United 
States Employment Service (USES) in St. Louis during the years of 1941 
and 1942. 


The Problem 


By virtue of the fact that many writers have dealt with the subject 
of work as a socio-economic factor in the incidence of venereal disease, it 
was the purpose of this study to ascertain whether there were reliable 
differences in the occupational adjustments of a group of sexually pro- 
miscuous female patients treated at the Midwestern Medical Center and 
a group of ostensibly normal girls. Occupational adjustment was con- 
sidered in terms of two factors: (1) the nature of the job, as indicated by 
occupational title,¹ and (2) job stability, as measured by the longest
consecutive number of months that each subject was gainfully employed
by a single employer. 


* This report is based on a research study conducted by the writer while affiliated 
with the United States Public Health Service. Appreciation is extended to Virginia S.
Lenobel, psychological assistant, for her assistance in the testing program. 

1 The occupational titles assigned to the subjects of both groups were based on the 
principles of job classification embodied in the Dictionary of Occupational Titles (4). 
Related Studies


The literature of the earlier studies of wayward females in general 
indicates again and again evidence of vocational maladjustment and the 
need of guidance. 

In a study of 181 delinquent females who had been committed to the 
Illinois State Training School for Girls, Abbott and Breckenridge (1) 
found that relatively few of the subjects had worked at or had prepared 
for jobs requiring skill or training. Domestic work, waitress work and 
other unskilled jobs characterized the usual type of employment. It was 
also observed that frequent change of job was commonplace in the work 
histories of these girls. Broughton (2), in discussing the relationships
between jobs and prostitution, pointed out that the girls who headed for
the “oldest profession” have had few opportunities for decent jobs. He
stressed the need for vocational guidance and vocational education in the 
redirection of the girls who might become the next crop of prostitutes. 
In the same vein, Parran (6), commenting on the reduction of prostitutes 
in Russia, pointed out that the girls apprehended there are given trade 
training as well as medical treatment to aid in their reclamation. 

In a study reported by De La Caro (3), of a group of adolescent girls, 
mostly prostitutes, interned in Caguas, Puerto Rico, the need for voca- 
tional orientation was again pointed out. De La Caro stated: 

“The facts obtained from this study prove that a large proportion of these 
girls are in a favorable condition for a possible rehabilitation and that a coor- 
dinated program of services could save them. The venereal disease hospitals 
have already made provisions for a program of recreation and vocational 
orientation in the hospital, but this is not sufficient. The period of hospitali-
zation of these patients is generally much too short to assure their permanent
social rehabilitation. We believe it is the responsibility of the community to
continue this work.”


Ness (5), in his talk at the Puerto Rico Regional Conference on Social 
Hygiene, held February 1944, clearly indicated that vocational reorienta- 
tion was highly necessary as part of a more general social rehabilitation 
of sexually promiscuous females treated for venereal disease. He called 
attention to the “revolving door” theory, i.e., letting the girls go back,
at the completion of their medical treatment, to the same conditions that 
helped produce their maladjustment and disease. 

In a recent study by Rachlin (7) of a mixed group of colored and white 
Midwestern Medical Center patients who had been treated prior to the 
group included in the present study, he found that it was difficult to deter- 
mine their earning power, because the subjects had not worked steadily. 
In listing the jobs held by the girls, he showed that waitress, factory 
laborer, housekeeper and other unskilled jobs were most common to the 
group. 
Like the authors of the other studies referred to above, Rachlin main-
tains that venereal disease can be eradicated only by a closely coordinated
medical and social rehabilitation program. It is in this vein that he 
recognizes the need for vocational guidance and reorientation. 


The Subjects 


The group which served as the basis for comparison was comprised of 
225 female job applicants, all residents of St. Louis, who had filed job 
applications with the USES. Their chronological ages ranged from 17 to 
30 years inclusive with a mean level of 22.9 years. A similar number of 
cases was selected from the original group of the 500 sexually promiscuous 
and venereally infected females treated at the Midwestern Medical 
Center to match as closely as possible the age distributions of the USES
group. These cases, too, were all residents of St. Louis. In order to
obtain the 225 cases for the hospital group, it was necessary to extend
their age range to 35 years. The mean age level for the cases included in
this group was 22.8 years, virtually the same mean as found for the
USES cases.


Table 1

The Comparative Age Distributions of the Midwestern Medical Center Patients and
the USES Job Applicants Included in the Study

                   Hospital Cases        USES Cases
Age in Years       No.       %           No.       %
35-35.9              1      .44
34-34.9              2      .89
33-33.9              3     1.33
32-32.9              3     1.33
31-31.9              3     1.33
30-30.9              4     1.78           13     5.78
29-29.9              6     2.67           11     4.89
28-28.9              6     2.67            7     3.11
27-27.9              8     3.56           12     5.33
26-26.9              9     4.00           17     7.56
25-25.9              8     3.56           10     4.44
24-24.9             13     5.78           12     5.33
23-23.9              9     4.00            9     4.00
22-22.9             34    15.12           14     6.22
21-21.9             25    11.11           27    12.00
20-20.9             29    12.89           33    14.67
19-19.9             22     9.78           20     8.89
18-18.9             38    16.88           38    16.89
17-17.9              2      .89            2      .89
Totals             225   100.00          225   100.00
Mean              22.8                  22.9


Table 2

The Reliability of the Differences Between the Groups in Chronological Age and
Duration of Longest Job

                               Hospital Group     USES Group
                               M       S.D.       M       S.D.      D        S.E. of D    C.R.
Chronological Age in Years     22.8    3.99       22.9    3.83      .17      .37          .46
Longest Job in Months          15.1    17.01      31.4    33.22     16.31    2.88         5.66



To reduce the influence in job adjustment which might be attributed 
to race differences, only white girls were included in this study. 

The Findings

That the hospital group and the USES group were closely matched
for age is seen in Table 1. The mean age for the hospital cases was
22.8 years, with a standard deviation of 3.99, whereas the USES job
applicants showed a mean age of 22.9 years and a standard deviation of
3.83. As seen in Table 2, the critical ratio ² determined for these age
differences (CR = .46) indicates that there was no significant difference
between the mean age of the matched groups.

Table 3 reveals that 214 of the 225 hospital patients studied, or 95
per cent, had experience of one month or more in a single job,
whereas, for the 225 USES job applicants, it is seen that only 158 of the
225 cases studied, or 70 per cent, had work experience of a month or more
on a single job. This difference was due to the fact that the USES group
included several girls who were recently out of school and were entering
the field of work for the first time.


Table 3
A Comparison of the Classified and Unclassified Subjects Found in Both Groups *

                 Hospital Group       USES Group
Subjects         No.       %          No.       %
Classified       214      95.0        158      70.0
Unclassified      11       5.0         67      30.0
Totals           225     100.0        225     100.0

* The subjects with one month or more of gainful experience in a single occupation
were designated as classified; those with less than one month were designated as
unclassified.

² A significant difference between groups requires a critical ratio ([value illegible
in source]) or higher (8).


Comparing the subjects of both groups wherein job titles were desig-
nated (i.e., where the individual had one month or more of gainful
experience in a single job) it is seen in Table 4 that the mean work period
for the hospital group was 15.1 months, with a standard deviation of
17.01. For the USES group the mean work period was 31.4 months,
with a standard deviation of 33.22. The critical ratio, as shown in Table
2, for these cases was 5.66, indicating a statistically significant difference.


Table 4
A Comparison of the Longest Work Periods for the Occupationally
Classified Subjects

Longest            Hospital Cases        USES Cases
Work Period
in Months          No.       %           No.       %
151-160                                    2     1.27
141-150                                    2     1.27
131-140                                    0      .00
121-130                                    1      .63
111-120              1      .47            3     1.90
101-110              1      .47            1      .63
 91-100              2      .93            3     1.90
 81- 90              1      .47            3     1.90
 71- 80              0      .00            4     2.53
 61- 70              0      .00            2     1.27
 51- 60              2      .93            9     5.70
 41- 50              6     2.80            8     5.06
 31- 40              7     3.27           14     8.86
 21- 30             23    10.75           28    17.72
 11- 20             46    21.50           22    13.92
  1- 10            125    58.41           56    35.44
Totals             214   100.00          158   100.00
Mean              15.1                  31.4
S.D.              17.0                  33.2
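
The critical ratio for job duration can be approximated from the summary statistics alone. The sketch below assumes the conventional formula, the difference between means divided by the standard error of that difference, sqrt(S.D.a^2 / Na + S.D.b^2 / Nb). The paper does not state the formula it used, so this is an assumption, and the function name is illustrative.

```python
import math

def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (difference, standard error of the difference, critical ratio)."""
    diff = abs(mean_a - mean_b)
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return diff, se_diff, diff / se_diff

# Longest job in months: hospital group (N = 214) against USES group (N = 158).
diff, se, cr = critical_ratio(15.1, 17.01, 214, 31.4, 33.22, 158)
print(f"D = {diff:.2f}, S.E. = {se:.2f}, CR = {cr:.2f}")
```

Because the inputs are the rounded means, the result differs slightly in the second decimal from the published figures (D = 16.31, S.E. = 2.88, CR = 5.66), which were presumably computed from unrounded values.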

A comparison of the groups based on the occupational classifications
assigned to the subjects is shown in Table 5. Outstanding among the
findings revealed in this table are the following: (1) the complete absence
of the hospital patients on the professional and managerial level, (2)
their comparatively poor representation in the skilled and semi-skilled
trades and (3) the preponderance of the hospital patients in the service
occupations as, for example, waitresses, houseworkers, bar-maids, etc.


Table 5
The Comparative Distributions of the Occupational Classifications of the Groups

                                        Hospital Cases       USES Cases
Occupational Classification             No.       %          No.       %
Professional and Managerial               0      .00           5     3.16
Clerical and Sales                       16     7.48          15     9.49
Service                                  93    43.46          11     6.96
Agricultural, fishery, forestry, etc.     2      .93           0      .00
Skilled                                  21     9.81          33    20.89
Semi-skilled                             42    19.62          57    36.08
Unskilled                                40    18.70          37    23.42
Totals                                  214   100.00         158   100.00


Summary and Conclusions 


On the basis of the findings of this study, it would appear that the 
sexually promiscuous and venereally infected patients who comprise the 
Midwestern Medical Center group differ from a group of ostensibly 
normal girls with reference to work adjustment. This is seen in the 
following: 


1. From the standpoint of length of service for a single employer, 
the hospital cases were not as stable as the USES group, the latter persons 
having worked twice as long on the average. 

2. The hospital patients were found much more frequently in jobs 
requiring little or no skill and training. This is evidenced by their com- 
plete absence on the professional and managerial level, their compara- 
tively poor representation in the skilled and semi-skilled trades and their 
relatively great distribution in the service occupations. 


No doubt there are many reasons why the girls, who are known to be 
sexually promiscuous, tend to manifest a generally inferior work adjust- 
ment. To be sure, low intelligence, emotional instability, economic in- 
security, or various combinations of these as well as other factors, may 
be included as basic causes in the total maladjustment syndrome of the 
sexually promiscuous female, but it is not within the scope of this paper to 
determine the causes of the maladjustment. It is rather the purpose 
here mainly to recognize that these individuals are different and are, 
consequently, in need of aid. 

Because of the fact that work plays so great a role in everyday life, 
it is important that job adjustment be of prime consideration in the
rehabilitation of the individual. An adequate vocational guidance 
program is therefore a necessary adjunct to any program, medical or 
otherwise, concerned with social reclamation. The role of the vocational 
counselor will, of course, be determined by the caseholding policy under 
which the institution operates. Where patients are held for medical 
treatment over a period long enough to permit adequate guidance work- 
up, including aptitude testing, the counselor alone can do a great deal 
toward the reorientation of the patients. On the other hand, where the 
latest intensive methods of therapy are used, and the patient turnover is 
rapid, the counselor can serve best by screening the patients and referring 
them to such community agencies as are available and willing to cooperate 
in the rehabilitation program. 


Received April 19, 1945. 
The Effect of Prolonged Mild Anoxia on Speech
Intelligibility * 


G. M. Smith 
College of the City of New York 


In two earlier papers in collaboration with C. P. Seitz (3, 4), it was 
demonstrated that speech intelligibility suffers a statistically reliable 
decrement at simulated altitudes as low as 16,900 ft., under certain con- 
ditions of initial difficulty. The decrement is a function of the initial 
intensity of the stimulus sounds, being greater for lower intensities. In 
view of the common practice in long-range bombing of flying to the 
region of the target at comparatively low altitudes without the use of 
oxygen masks, it was felt that an investigation of possible hearing losses 
under the stress of mild but prolonged anoxia might be of some interest. 
In the present study tests of speech intelligibility were made at intervals 
throughout a number of eight-hour sessions in a nitrogen dilution chamber 
in which an altitude of approximately 10,000 ft. was simulated.¹


Method and Materials 


The method of observing the effect of oxygen deprivation on the sub- 
jects’ ability to perceive speech sounds and the test materials employed 
were the same as those employed in the second study mentioned above 
(4). The subjects indicated their responses to the stimulus words on 
check lists. The stimulus materials were made up in part from standard 
word lists developed by the Bell Telephone Laboratories, covering the 
more frequent sounds that occur in common speech. Other lists were 
constructed on the principles employed by the Bell Laboratories (2). In- 
telligibility for vowel sounds was tested by lists of monosyllables all 
having the same initial and final consonants in any one list; e.g., suit, sit, 
sat, set, etc. Consonant intelligibility was tested by similar lists, each 
involving a constant vowel sound, but a variation in the initial or final 
consonant; e.g., nor, bore, yore, more, etc. 

To insure uniformity of stimuli the test items were recorded by means
of the high fidelity equipment at the National Broadcasting Company
studios in New York City.² To minimize the effect of wear these
recordings were later put in semi-permanent form by the RCA Manufacturing
Co. at Camden, N. J.³ The recordings were played back through a
Fairchild pick-up⁴ coupled with a Presto amplifier⁵ (especially adapted
so as to give a relatively flat response curve) and Western Electric
earphones.⁶


* I am indebted to the Linde Air Products Co. for a liberal grant of oxygen and
nitrogen, and to Messrs. Mortimer Feinberg and Max Rosenbaum for valued clerical
and statistical assistance.

¹ The effects of the experimental variable on several other functions studied during
the same sessions are reported elsewhere (5, 6, 7).

The experiments were carried out in the nitrogen dilution chamber of 
the College of the City of New York, described in a previous paper (4). 
This maintains temperature and humidity at constant and comfortable 
levels and provides quite adequate sound-proofing. The simulated 
altitudes for the four experimental runs averaged 9,993 ft. (corresponding 
to an oxygen percentage of 14.3). The average altitudes maintained
during the individual runs were 8,930 ft., 10,610 ft., 9,810 ft., and 10,620 
ft. Samples of the chamber air were taken at intervals of approximately 
one hour on every run and were analyzed by means of the Haldane- 
Henderson-Bailey gas analysis apparatus. The mean deviation from the 
general average of the 38 individual samples analyzed was 680 ft. On 
the four control runs there was, unfortunately, some over-compensation 
for the leakage from the experimenter’s mask, which caused a moderate 
climb from sea level during the runs, the average altitude for all four 
control runs being 1,810 ft. The mean deviation of the individual 
readings from this average was 1,510 ft. It is quite improbable that an 
altitude of this order, especially one simulated by reduced oxygen tension 
without change in total pressure, could have an adverse effect on hearing. 
Furthermore, the data from the four control runs indicated that on the 
runs for which the altitude was nearer to sea level the performance of the 
subjects was generally worse rather than better. It therefore seems 
justifiable to regard any altitude effects obtained on the intentional 
altitude runs as applicable to the 10,000 ft. level. 
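
The correspondence between simulated altitude and oxygen percentage, and the averaging of the four run altitudes, can be illustrated with a short calculation. The sketch below is offered under stated assumptions only: the paper does not say how altitude was converted to oxygen percentage, so the standard-atmosphere pressure formula, the sea-level figure of 20.95 per cent oxygen in dry air, and all function names are supplied here for illustration. On these assumptions the result falls near, though not exactly on, the 14.3 per cent reported.

```python
# Sketch only: standard-atmosphere conversion (an assumption, not the paper's stated method).
FT_TO_M = 0.3048
SEA_LEVEL_O2_PERCENT = 20.95  # dry air at sea level (assumed)


def pressure_ratio(altitude_ft: float) -> float:
    """Ratio of ambient to sea-level pressure in the standard atmosphere (troposphere)."""
    height_m = altitude_ft * FT_TO_M
    return (1.0 - 0.0065 * height_m / 288.15) ** 5.25588


def equivalent_o2_percent(altitude_ft: float) -> float:
    """Oxygen percentage at sea-level pressure giving the same oxygen tension as the altitude."""
    return SEA_LEVEL_O2_PERCENT * pressure_ratio(altitude_ft)


def mean(values):
    return sum(values) / len(values)


# Average altitudes of the four experimental runs, as given in the text (ft.).
run_altitudes_ft = [8930, 10610, 9810, 10620]

if __name__ == "__main__":
    avg = mean(run_altitudes_ft)
    print(f"mean simulated altitude: {avg:.1f} ft.")
    print(f"equivalent oxygen percentage: {equivalent_o2_percent(avg):.1f}")
```

The averaged run altitudes reproduce the 9,993 ft. figure; the oxygen percentage depends on the atmosphere model chosen and should be read as approximate.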

The carbon dioxide problem was effectively solved by means of an air 
conditioning device which circulated the chamber air through tubes of 
Shell Natron under forced draft. The 65 individual CO₂ readings, for
both the altitude and control runs combined, averaged 0.28%, with a 
mean deviation of 0.10%. 


² I am indebted to the National Broadcasting Co. and to Mr. R. A. Lynn of the
Engineering Dept. for their cooperation.

³ I am indebted to the RCA Manufacturing Co. and to Mr. W. L. Tesch of the
Record Engineering Dept. for their courtesy.

⁴ Turntable unit model 199 and pick-up model 209.

⁵ Model 87B.

⁶ Type 588A.


Procedure


Twelve male college students, including several ASTP volunteers,⁷
served as subjects. Their ages ranged from 17 to 20 years, with the median
at 18 years. They worked in four groups of three, each group being 
tested both at the experimental altitude of approximately 10,000 ft. and 
at the control altitude of approximately 1,800 ft. for a continuous eight- 
hour run. The order of the altitude and control runs was reversed for 
alternate groups of subjects so as to minimize the effect of practice. The 
two runs were separated by an interval of one week. The usual pre- 
cautions to allay fear of the chamber were taken, the experimenter re- 
maining in the chamber throughout the run. To keep the suggestion 
factor constant for the two runs in which each subject participated, the 
procedure of the experimental run, including the manipulation of oxygen 
and nitrogen valves and the wearing of an oxygen mask by the experi- 
menter, was carefully imitated on the control runs. Judging by the 
subjects’ reactions, the deception was quite general. To relieve the 
tedium the subjects were permitted to read or study quietly when not 
engaged in taking tests. During the testing the subjects were comfort- 
ably seated and were equipped with standard Western Electric earphones 
such as are used by the American Airlines on transport planes.⁸

Record booklets made up of four separate tests, each containing 
check rows for 11 vowels and 24 consonants, were employed. This 
made a total of 44 vowel items and 96 consonant items. The order of 
the four tests was varied from one testing period to the next in order to 
minimize practice effects. The testing periods, which were approximately 
20 minutes in duration, came after average elapsed times in the chamber 
of ¾ hr., 2¾ hrs., 4¾ hrs., and 6¾ hrs., approximately. Between the
second and third testing periods a high protein standardized lunch
intervened.⁹ This began after an elapsed time of approximately 3¾ hrs.
and was finished within % hr. 

The sound level of the stimulus words was the same as that used in the
second study in collaboration with C. P. Seitz (4). This was set intentionally
at a fairly low value¹⁰ so that the effect of anoxia, if any, might
be more readily demonstrated. Specifically, the mean consonant articulation
value (the per cent of correct responses) was in the range 50-60%,
the exact value varying somewhat with the sample of subjects used. At
this sound level, and with the same test materials and reproducing equipment
as those employed in the present study, there was no reliable
decrement in intelligibility observed after an exposure to a simulated
altitude of 13,600 ft. for approximately one hour. However, in this
earlier study a simulated altitude of 16,900 ft. did produce a reliable
decrement.


⁷ I am indebted to Colonel Raymond P. Cook for his cooperation in permitting the
use of Army volunteers for this rather tedious ordeal.

⁸ Type 588A. The response curve of one set of phones was tested by the Stevens
Institute of Technology through the courtesy of President H. N. Davis and Dr. H.
Burris-Meyer. Though the curves are peaked rather sharply at 1,000 cps., they are
relatively flat (±10 db.) between 2,000 and 8,000 cps. The transmission system as a
whole was sufficiently free from distortion to make possible a ready identification of the
speakers’ voices. I am indebted to the American Airlines and to Messrs. D. W. Rentzel
and H. A. Wolfe for the use of the phone sets.

⁹ Two ham or cheese sandwiches and one pint of milk. For each subject the diet
was the same during the experimental and control runs.


Results 


The principal data are summarized in Table 1, which gives the
articulation values (the per cent of correct responses) for vowels, consonants,
and standard syllables for each of the twelve subjects, for both
the altitude and the control runs, for each of the four testing periods.
Standard syllable articulation values were calculated from the vowel and
consonant values by the Fletcher and Steinberg formula $S = 1 - (1 - VC^{2})^{0.9}$,
derived empirically from extensive observations in the Bell Laboratories
(2). The means and the probable errors of these means also appear in
this table. As is to be expected, the articulation values for vowels are
consistently better than those for consonants for all subjects under both
altitude and control conditions. For all four periods the mean performances
under the control conditions are superior to the mean performances
at altitude for each of the three criteria. The standard syllable criterion
is the most meaningful since it takes into account both vowel and consonant
articulation and most nearly approximates speech. The mean syllable
articulation values for both the altitude and the control runs are
plotted in Figure 1. This gives us an impression of maximum altitude
handicap at periods II and III, 2¾ hrs. and 4¾ hrs. after the beginning
of the run, respectively, and also suggests a marked end-spurt at the
6¾ hr. mark.
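
The syllable values of Table 1 can be recomputed from the vowel and consonant values. The sketch below applies the relation S = 1 - (1 - VC²)^0.9, with V and C taken as proportions; the function name is an illustrative choice, not part of the original study. Entries recomputed in this way agree with the tabled values to within about half a percentage point, the small differences presumably reflecting rounding in the original computation.

```python
def syllable_articulation(vowel_pct: float, consonant_pct: float) -> float:
    """Standard syllable articulation (per cent) from vowel and consonant articulation (per cent).

    Implements S = 1 - (1 - V * C**2) ** 0.9 with V and C as proportions.
    """
    v = vowel_pct / 100.0
    c = consonant_pct / 100.0
    return 100.0 * (1.0 - (1.0 - v * c ** 2) ** 0.9)


if __name__ == "__main__":
    # (vowel %, consonant %, tabled syllable %) for three Period III control entries.
    rows = [(79.5, 40.5, 11.5), (97.5, 72.0, 47.0), (100.0, 73.0, 49.5)]
    for vowel, consonant, tabled in rows:
        computed = syllable_articulation(vowel, consonant)
        print(f"V={vowel:5.1f}  C={consonant:5.1f}  computed={computed:5.1f}  tabled={tabled:5.1f}")
```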

As a check on the impression given by this figure, Fisher’s t-statistic 
was calculated for each of the four periods to see whether the apparent 
differences were reliable. The results of these calculations are presented 
in Table 2. This indicates that the small differences which appear
between the altitude and the control performances at the ¾ hr. and the
6¾ hr. periods are quite unreliable. However, the differences at the
2¾ hr. and 4¾ hr. periods, though not strictly reliable by rigorous stand-


¹⁰ Approximately 24 db. through the phones, against an external background sound
level of approximately 70 db. in the chamber due primarily to fans in the temperature
control and circulation systems.

Table 1

Articulation Values for Vowels, Consonants, and Standard Syllables at
Four Different Periods

Period I
Elapsed Time: ¾ Hr.
                     Control                          Altitude
Subject      Vowel* Consonant†  Syllable‡     Vowel* Consonant†  Syllable‡
                  %          %          %          %          %          %
1              75.0       47.0       15.0       75.0       34.5        9.0
2             100.0       65.5       39.5       95.5       44.0       17.0
3              97.5       59.5       32.0       93.0       45.0       17.0
4              95.5       73.0       47.5       95.5       56.5       28.0
5             100.0       64.5       38.5       47.5       19.0        1.5
6             100.0       78.0       57.0       97.5       67.5       41.0
7              97.5       63.5       36.5       97.5       86.5       69.5
8             100.0       56.5       29.5      100.0       72.0       48.0
9              95.5       56.5       28.0       97.5       57.5       29.5
10             88.5       50.0       20.0      100.0       62.5       36.0
11             70.5       37.5        9.0       95.5       55.0       26.5
12             97.5       47.0       19.5       95.5       49.0       21.0
Mean           93.1       58.2       31.0       90.8       54.1       28.7
P.E.m          1.86       2.23       2.64       1.38       3.21       3.29

Period II
Elapsed Time: 2¾ Hrs.
                     Control                          Altitude
Subject       Vowel  Consonant   Syllable      Vowel  Consonant   Syllable
                  %          %          %          %          %          %
1                         47.0       17.5                  35.5       10.0
2                         73.0       49.5                  55.0       27.5
3                         62.5                             49.0       20.5
4                         67.5                             38.5        9.0
5                         70.0                             58.5       30.0
6                         81.5                             66.5       41.0
7                         79.0                             75.0       52.5
8                         62.5                             69.0       44.0
9                         54.0                             56.5       29.5
10                        36.5       11.5                  52.0       25.0
11                        41.5       14.5       88.5       48.0
12                        49.0       21.5       88.5       54.0       23.5
Mean                      60.3       34.2       92.9       54.8       27.6
P.E.m          1.05       3.01       3.46       1.72       2.10       2.40

Period III
Elapsed Time: 4¾ Hrs.
                     Control                          Altitude
Subject       Vowel  Consonant   Syllable      Vowel  Consonant   Syllable
                  %          %          %          %          %          %
1              79.5       40.5       11.5       79.5       33.5        8.0
2              97.5       72.0       47.0      100.0       51.0       24.0
3             100.0       62.0       35.5       97.5       46.0       20.0
4             100.0       70.0       45.5       95.5       55.5       27.0
5             100.0       73.0       49.5       97.5       67.5       41.0
6             100.0       73.0       49.5       95.5       74.0       48.5
7             100.0       74.0       51.5      100.0       75.0       52.5
8             100.0       61.5       35.0       93.0       73.0       46.0
9              97.5       44.0       17.5      100.0       47.0       20.5
10             97.5       45.0       18.0       95.5       36.5       11.5
11             88.5       48.0       18.5       88.5       44.0       15.5
12             95.5       54.0       25.5       97.5       51.0       23.0
Mean           96.3       59.7       33.7       95.0       54.5       28.1
P.E.m          1.09       2.73       3.14       0.98       2.95       3.06

Period IV
Elapsed Time: 6¾ Hrs.
                     Control                          Altitude
Subject       Vowel  Consonant   Syllable      Vowel  Consonant   Syllable
                  %          %          %          %          %          %
1              88.5       55.0       24.5       79.5       41.5       12.0
2              97.5       69.0       43.0       95.5       65.5       38.0
3             100.0       64.5       37.0       95.5       63.5       35.5
4             100.0       76.0       54.5       95.5       61.5       33.0
5             100.0       72.0       48.5       97.5       61.5       34.0
6             100.0       78.0       57.0      100.0       66.5       41.0
7             100.0       67.5       42.5      100.0       82.5       64.0
8              97.5       44.0       17.0      100.0       70.0       45.5
9              97.5       56.5       28.5       86.5       60.5       29.0
10            100.0       45.0       18.5       75.0       38.5       10.5
11            100.0       44.0       17.5       75.0       49.0       17.0
12             91.0       51.0       21.5       97.5       60.5       33.0
Mean           97.7       60.2       34.2       91.5       60.1       32.7
P.E.m          0.65       2.67       3.15       2.03       2.08       2.71

* There were 44 vowel items.
† There were 96 consonant items.
‡ $S = 1 - (1 - VC^{2})^{0.9}$.

ards, are at least suggestive. The probabilities that the differences
could have arisen on the basis of chance are .14 and .07, respectively.¹¹
These more nearly reliable values correspond to the higher percentages
obtained when the ratios of control to altitude performances are
calculated. The control performance is 24% better than the altitude
performance after a 2¾ hr. exposure and 20% better after a 4¾ hr. exposure;
whereas the control performance is only 8% better at the ¾ hr. point
and 5% better at the 6¾ hr. point. The relatively poor performance on

Fig. 1. Mean syllable articulation at the four test periods.


the control run at the first test period may quite possibly be the result of 
the belief on the part of the subjects that they were at altitude. 
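
The control-to-altitude ratios and the combined probability cited in this connection can be checked by direct calculation. The sketch below takes the period means and the two probabilities from Table 2; the function names are illustrative. For Fisher's technique it uses the closed form of the chi-square tail for an even number of degrees of freedom, which is an implementation convenience chosen here and not something described in the paper.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent probabilities by Fisher's method.

    The statistic -2 * sum(ln p) is chi-square with 2k degrees of freedom;
    for even degrees of freedom the upper tail has the closed form
    exp(-x/2) * sum_{i<k} (x/2)**i / i!.
    """
    k = len(p_values)
    half = -sum(math.log(p) for p in p_values)  # equals x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))


def ratio_percent(control, altitude):
    """Control performance as a percentage of altitude performance, rounded to whole per cent."""
    return [round(100.0 * c / a) for c, a in zip(control, altitude)]


# Mean syllable articulation for periods I to IV, from Table 2.
control_means = [31.0, 34.2, 33.7, 34.2]
altitude_means = [28.7, 27.6, 28.1, 32.7]

if __name__ == "__main__":
    print("Con./Alt. x 100:", ratio_percent(control_means, altitude_means))
    print(f"combined P for periods II and III: {fisher_combined_p([0.14, 0.07]):.3f}")
```

The combined value comes out a little above .055, which rounds to the .06 given in the footnote.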

The question of the end-spurt in the performance at altitude at the 
6¾ hr. period deserves some consideration. This is so marked as to
practically wipe out any indication of reduced capacity to perceive speech 
sounds at the 10,000 ft. altitude. It strongly suggests that the apparent 
altitude effect shown at periods II and III is not primarily an auditory


¹¹ When these probabilities are combined by Fisher’s technique for combining
probabilities from independent tests of significance (1), the value of the combined P
becomes .06.
defect, but that it may be due in large part to a wandering of attention or
a lessening of motivation engendered by the tedium and boredom of an 
eight-hour run in confined and uninspiring quarters. This hypothesis 
gains support from other data collected during the same runs reported in 
full elsewhere (5). The subjects periodically rated themselves with 
respect to their feelings of sleepiness, boredom, fatigue, depression, ir- 
ritability, and general well-being, and with respect to their motivation, 
attention, etc. Quite generally an end-spurt in the direction of better 
adjustment occurred in these subjective measures at the last period, when 
the subjects knew that their ordeal was almost over. 

It must not be thought, however, that there are no physiological 
effects from an eight-hour exposure to the mild anoxia encountered at a 











Table 2

Comparison of Mean Syllable Articulation Values for Control and Altitude Runs

Period                                   I        II       III*     IV
Average Time Elapsed (in Hours)          ¾        2¾       4¾       6¾
Mean for Control Run                   31.0%    34.2%    33.7%    34.2%
Mean for Altitude Run                  28.7%    27.6%    28.1%    32.7%
Difference (Con.-Alt.)                  2.3%     6.6%     5.6%     1.5%
t Value                                 0.4      1.6      2.0      0.03
P (Probability that Difference is
  Due to Chance)                        .70      .14**    .07**    .77
Con./Alt. x 100                        108%     124%     120%     105%





* High protein standardized lunch intervened between periods II and III. It was 
started after an elapsed time of 3} hours and was finished within } hour. 

** When these probabilities are combined by Fisher’s technique for combining 
probabilities from independent tests of significance (1), the value of the combined P 
becomes .06. 


simulated altitude of approximately 10,000 ft. There was a quite 
general increase in sleepiness, feeling of fatigue, depression, and headache, 
and a decrease in the general feeling of well-being for the altitude runs as 
a whole, as compared with the control runs. Furthermore, there was 
quite reliable and impressive evidence that the angioscotoma, which is 
relatively free from subjective influence, increased in magnitude during 
the same altitude runs for the same subjects. These data are also re- 
ported in full elsewhere (6). It should likewise be borne in mind that the 
sensitivity of any criterion of speech intelligibility can be increased by an 
increase in the initial difficulty of the stimulus material at sea level. It 
is entirely possible that a clearer indication of diminished capacity to hear 
speech sounds might have been obtained had the stimulus words been 
set at a lower sound level. It will be recalled that the sound level and
the stimulus materials employed in this study were the same as those 
employed in an earlier study which failed to reveal any reliable decrement 
in speech intelligibility for an exposure of one hour to a simulated altitude 
of 13,600 ft., though a reliable decrement appeared for a similar exposure 
at 16,900 ft. 


Summary 


1. Using the method and materials employed in an earlier study in 
collaboration with C. P. Seitz (4), twelve subjects were tested for their 
ability to perceive standard speech sounds at four periods during an 
eight-hour exposure to the mild anoxia encountered at an altitude of ap- 
proximately 10,000 ft., simulated in a nitrogen dilution chamber. 

2. The decrement in speech intelligibility at altitude was very slight 
and unreliable at the ¾ hr. period; it was nearly reliable at the 2¾ hr.
and 4¾ hr. periods; but there was a marked lessening of the altitude
effect at the last period, 6¾ hrs. after entering the chamber.

3. The subjects’ ability to overcome the mild deterioration in per- 
formance exhibited in the middle of the run in an “end-spurt” suggests 
that the apparent loss of efficiency at the altitude and sound level em- 
ployed is primarily due to subjective factors such as wandering attention 
and boredom. The subjects did in fact report that there was a greater 
increase in sleepiness and boredom generally in the altitude runs than in 
the control runs. However, evidence derived from another aspect of the 
study reported elsewhere (6) indicates clearly that there is not a general 
absence of physiological involvement: there was on the contrary a reliable 
and progressive enlargement of the angioscotoma during the prolonged 
altitude exposure. Nevertheless, it seems improbable that significant 
losses in speech intelligibility will occur on prolonged bombing missions 
at altitudes of the order investigated; for with properly functioning sound
equipment the sound level is much higher than the one employed in this 
study. It is possible, however, that the subjective factors mentioned 
may cause errors in speech perception. 


Received April 15, 1945. 


Studies in International Morse Code: V. The Effect of
the “Phonetic Equivalent”


F. S. Keller, I. J. Christo, and W. N. Schoenfeld 
Columbia University 


The present study originated in two closely related ideas, independ- 
ently formulated and suggested almost simultaneously by B. F. Skinner 
and M. Wertheimer (1). Each suggestion arose from the examination of 
a preliminary outline of a training method described earlier in this series 
(2), in which the Signal Corps “phonetic equivalents” of the alphabet 
were employed to identify the individual signals of International Morse 
code. 

With respect to this system of identification, both Skinner and 
Wertheimer proposed that phonetic equivalents be chosen on the basis of 
their formal similarity to the signals themselves, without discarding the 
distinctive cue furnished by the initial letter. Skinner pointed out that 
the natural tendency of the student to “echo” the auditory signal prior 
to the explicit written response might be utilized by providing equivalents 
having sufficient formal overlap with the signals to encourage the arousal, 
through “verbal summation,” of the equivalent itself which, in turn, 
provides the letter cue. “Thus ... (S) will no longer lead to the
echoic ‘di-di-dit’, which is of little value, but to ‘Sicily’ which gives the 
letter.” 

Wertheimer’s proposal was similar in suggesting the use of “structu- 
rally appropriate” equivalents, that is, “words which have the same 
rhythm, inner grouping, accentuation, length-hierarchy” as the signals,
to facilitate the learning, aid against forgetting, help in recall, and 
avoid confusion of the signals with one another. Wertheimer em- 
phasized the “structural isomorphism” of signal and word, and reported 
the results of an exploratory experiment which apparently demonstrated 
the advantage of isomorphic equivalents in both learning and retention. 

On the strength of these proposals, it was decided to adapt the general 
idea to the code teaching procedure referred to above. The tentative 
list of equivalents offered by Wertheimer for nineteen of the signals; 
those offered by Skinner for the entire twenty-six; the Signal Corps 
equivalents; and the words finally selected for the present experiment, 
are shown in Table 1. The final selection was made as follows: about 
seventy-five experienced code students were asked to vote on the iso- 

Table 1

Words Used as Morse Code Signal “Equivalents”
Note: No equivalents are used for digit signals.

Alphabet  Code Signal  Signal Corps  Skinner           Wertheimer        Final Selection
A         .-           Able          Around            Ahoy              Around
B         -...         Baker         Beautifully       Boomalacha        Beat Germany
C         -.-.         Charlie       Chattanooga       Coca Cola         Casa Blanca
D         -..          Dog           Dominic           Daintily          Dog did it
E         .            Easy          Eek               Ebb               Eek
F         ..-.         Fox           Federation        Forestation       Federation
G         --.          George        Gold goggles      Gamekeeper        Gamewarden
H         ....         How           Hilly-billy       Helter-skelter    Hilly-billy
I         ..           Item          Itchy             —                 Itchy
J         .---         Jig           Jemima’s jam      —                 Japan sand man
K         -.-          King          Kangaroo          —                 Kangaroo
L         .-..         Love          Legitimate        Los Angeles       Liberia
M         --           Mike          Ma-ma             Mainstay          Ma-ma
N         -.           Nan           Naughty           Nasty             Nazi
O         ---          Oboe          Oh-oh-oh          Oh my dear        Oh-oh-oh
P         .--.         Peter         Prefer posies     Police station    Police station
Q         --.-         Queen         Quadruplicate     —                 Quadruplicate
R         .-.          Roger         Revolting         Removal           Revolver
S         ...          Sugar         Sicily            Sicily            Sicily
T         -            Tare          Toot              Tea               Toot
U         ..-          Uncle         Unafraid          Uncle Sam         Unafraid
V         ...-         Victor        Victory now       Victory soon      Victory now
W         .--          William       Without funds     With all might    Without arms
X         -..-         X-ray         Xylophone band    —                 Excellent work
Y         -.--         Yoke          — yacht           —                 Yankee rampart
Z         --..         Zebra                           —                 Zulu did it






morphic suitability of the Skinner-Wertheimer words, together with a 
number of alternative equivalents. These words, which were presented 
two or three times in connection with their respective signals, were pro- 
nounced with a syllabic emphasis that was calculated to enhance the 
similarity of word and signal without providing a marked distortion of 
the former. The signals were transmitted at the same rate as used in 
actual training. While the equivalents finally selected may not wholly 
meet Skinner’s or Wertheimer’s standards, it was felt that they should
reduce training time if the basic idea were sound, and that refinement might
be postponed until their superiority was demonstrated. The aim of the ex- 
periment was not, of course, the confirmation of Wertheimer’s results, since 
the latter were obtained by a different method—one quite impracticable 
as a general training device. 

Nineteen Columbia College undergraduates, ranging in age from
seventeen to twenty-three years, were used as subjects in the present 
experiment. All were inexperienced in code; and all were given fifty 
minutes of daily training, as a code class, Monday through Friday, 
throughout the learning period. 

In evaluating the influence of the new equivalents, the results of two 
other experiments are available for comparison. In both of these, under- 
graduate code classes were used, and the training procedure was identical 
with that of the present study except that Signal Corps equivalents were 
employed. The first (3) will hereinafter be called Experiment I; the 
second (5), Experiment II; and the present study, Experiment III. 

The criterion of mastery in Experiments II and III was set as three 
successive 100-signal runs in each of which the student made no more 
than five errors (either of substitution or omission). The average 
number of runs up to the criterion in Experiment II was 23.2 (S.D., ±
12.8); in Experiment III, the average was 22.9 (S.D., ± 8.95). The
similarity of results with the two groups of subjects is equally clear when 
the cumulative progress curves for Experiments II and III are compared 
(see Figure 1). It is evident that, insofar as speed of learning is con- 
cerned, neither set of equivalents has the advantage. 
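
The criterion of mastery described above lends itself to a simple procedural statement. The sketch below is illustrative only: the function name and the error counts are hypothetical, and it assumes that the number of runs "up to the criterion" includes the three criterion runs themselves, a point the text does not settle.

```python
def runs_to_criterion(errors_per_run, max_errors=5, streak=3):
    """Return the 1-based number of the run completing the first qualifying streak.

    A run qualifies when it contains no more than `max_errors` errors
    (substitutions and omissions together) in 100 signals. Returns None
    if the criterion is never met.
    """
    consecutive = 0
    for run_number, errors in enumerate(errors_per_run, start=1):
        consecutive = consecutive + 1 if errors <= max_errors else 0
        if consecutive == streak:
            return run_number
    return None


if __name__ == "__main__":
    # Hypothetical error counts for one student on successive 100-signal runs.
    student_errors = [20, 12, 6, 5, 4, 7, 3, 2, 1]
    print(runs_to_criterion(student_errors))
```

In the hypothetical record, the lapse on the sixth run resets the count, so the criterion is not reached until the ninth.
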

Fig. 1. Progress curves for the students in Experiments II (solid line) and III
(broken line). The data plotted represent the average per cent of correct responses
for the two groups on successive training “runs” of 100 randomized signals each. Prior 
to the first run, the 36 signals were identified, hence the curves do not start at zero; 
95% correct performance was accepted as mastery. The curve for Experiment III 
reaches the 95% criterion on run 25, that for Experiment II on run 30. 

Table 2

Rank Order Correlations (with PE’s) of the 36 Signals in Experiments I, II, and III
Note: The correlations are for rank order of signal difficulty, as based upon
substitution, omission, and total errors.

                       Exp. I-Exp. II    Exp. I-Exp. III    Exp. II-Exp. III
Substitution Errors     +.85 ± .03        +.78 ± .05         +.78 ± .05
Omission Errors         +.93 ± .02        +.93 ± .02         +.92 ± .02
Total Errors            +.95 ± .01        +.90 ± .02         +.92 ± .02





In Table 2 are presented the rank order correlations of the thirty-six 
signals, based on substitution, omission, and total errors in Experiments 
I, II, and III. These correlations indicate that (1) the rank orders for 
total and omission errors are practically identical in the three experi- 
ments, but, (2) in terms of substitution errors, the new equivalents have 
an effect, as shown by the depressive action of Experiment III upon the 
correlations. (The stability of the omission errors probably arises from 
the fact that these errors occur mainly in the very early runs, before the 
equivalents exercise their influence, and while discriminative failure due 
to stimulus generalization is primary.) 
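
A rank order correlation of the kind reported in Table 2 may be computed as follows. The sketch makes two assumptions not stated in the paper: tied values receive average ranks, and the probable error is taken as 0.7063(1 - rho²)/√N, a formula in common use at the period which happens to reproduce the probable errors of Table 2 for N = 36. The error counts in the example are invented for illustration.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Rank order correlation: the product-moment correlation of the two sets of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def probable_error_rho(rho, n):
    """Probable error of rho by the assumed formula 0.7063 * (1 - rho**2) / sqrt(n)."""
    return 0.7063 * (1.0 - rho ** 2) / math.sqrt(n)


if __name__ == "__main__":
    # Invented error totals for five signals in two experiments.
    errors_a = [40, 12, 25, 7, 31]
    errors_b = [30, 15, 33, 9, 20]
    rho = spearman_rho(errors_a, errors_b)
    print(f"rho = {rho:+.2f} ± {probable_error_rho(rho, len(errors_a)):.2f}")
```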

A better notion of the shift in substitution errors may be obtained 
from Table 3 wherein are given the rank order correlations between the 


Table 3

Eight Cases in which Substitution Errors Common to a Signal in Experiments II
and III were Ranked as to Frequency and Correlated

Rho ± P.E.

+.42 ± .19
+.58 ± .15
+.24 ± .21
+.16 ± .19
+.01 ± .21
-.07 ± .22
+.03 ± .22
+.84 ± .12

main substitution errors in Experiments II and III for eight of the thirty-six
characters which were selected because the correlations for them could
be based on at least ten substitutions, with each substitution occurring
at least ten times. Thus, for “P”, there were twelve identical substitution
errors, each made at least ten times, in the two experiments, giving
two rank orders which correlated +.42 ± .19.

Rank order correlations, of course, mask differences in absolute
frequencies. For example, it was found that “C” was substituted for 
“F” twenty-nine times in Experiment II and 136 times in Experiment 
III. A percentage comparison revealed that this substitution accounted 
for nine per cent of all the substitutions for ‘“F” in Experiment II, and 
forty-nine per cent in Experiment III, the difference being highly signi- 
ficant. Other cases of shift in absolute frequency which met the test 
for significance of percentage differences included the “K” substitution
for “U”, the “F” for “C”, the “C” for “F”, and the “U” for “K”. A
possible reason for these substitutions will be offered below. 
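
The comparison of percentages for the “C”-for-“F” substitution can be illustrated with a two-proportion test. The sketch rests on assumptions throughout: the paper does not name the test it used, so a pooled z-test is substituted here, and the totals of all substitutions for “F” (about 322 and 278) are not given in the text but are back-calculated from the counts and the rounded percentages quoted above.

```python
import math


def two_proportion_z(count_1, total_1, count_2, total_2):
    """Pooled z statistic for the difference between two independent proportions."""
    p1 = count_1 / total_1
    p2 = count_2 / total_2
    pooled = (count_1 + count_2) / (total_1 + total_2)
    standard_error = math.sqrt(pooled * (1.0 - pooled) * (1.0 / total_1 + 1.0 / total_2))
    return (p2 - p1) / standard_error


# Counts from the text; totals are back-calculated assumptions (29 is 9% of ~322, 136 is 49% of ~278).
c_for_f_exp2, all_f_subs_exp2 = 29, round(29 / 0.09)
c_for_f_exp3, all_f_subs_exp3 = 136, round(136 / 0.49)

if __name__ == "__main__":
    z = two_proportion_z(c_for_f_exp2, all_f_subs_exp2, c_for_f_exp3, all_f_subs_exp3)
    print(f"z = {z:.1f}")
```

A z of this size lies far beyond any conventional standard of reliability, which accords with the description of the difference as highly significant.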

It has been argued elsewhere (5) that the difficulty of signals in 
International Morse code is primarily a matter of stimulus generalization ; 
that certain signals, due to the possession of common properties, give 
rise to identical responses, thus to “errors” in code reception. The 
importance of the auditory stimulus similarity is undeniable, but it is 
now clear that, under the conditions of training that prevailed in these 
experiments, the response to a signal is in some measure due to another 
factor—namely, the nature of the word employed to identify a signal. 
More generally, it may be said that, subsequent to the presentation of a 
signal-stimulus, there takes place some form of activity on the part of 
the code learner which works in a supplementary fashion to determine the 
final, written response. 

Both student reports and experimenters’ observations support the 
assertion that, regardless of the specific training procedure, the beginning 
code student does something in the interval between the signal presentation 
and the written (or spoken) response. He may tap with his pencil, his 
finger, or his foot; he may whistle softly, or whisper, to himself; he may 
shake his head and, under strong motivation, seem fairly to bounce in 
his seat. When no overt activity occurs, he will commonly report sub- 
vocal, sub-gestural, and visualizing activity, or say that he re-hears or 
echoes the signal before “responding” to it by writing (or speaking) a 
character. It is also to be observed that during code mastery the 
amount of this ‘intervening activity’ decreases, first for the signals that 
generalize least, and last for those showing strong and widely distributed 
generalization. 

It has been noted above that, in Experiment III, there was a marked 
increase in the frequency of certain errors. This may be related to the 
tendency of the subject to retain his own stress pattern for certain identi- 
fying words in spite of the experimenter’s attempt to make these words 
more nearly isomorphic. “Federation” and “Casa Blanca” were fre-
quently confused; so were “Unafraid” and “Kangaroo,” and so forth. 
In effect, these were “poor” isomorphs. It does not, however, appear 
likely that the best of isomorphs would entirely avoid the distortion
introduced by the already-formed differentiations in the individual 
subject’s language behavior. 

In connection with an earlier attempt to analyze the errors made in 
code learning (4), a few outstanding types of substitution have been 
described: (1) reversal errors, in which a signal is mistaken for its ‘mirror 
image’; (2) inversion errors, in which a signal is mistaken for one having 
the same number of components, but in which dot replaces dash and 
dash replaces dot; and (3) dotting errors, in which a signal is mistaken for 
one having a smaller or, less frequently, a larger number of dots. A 
comparison of the frequency of occurrence of these error types in this 
experiment with the frequency in Experiments I and II reveals a note- 
worthy decrease in the percentage of reversal errors in this experiment. 
The attribution of this change to the effect of the new equivalents is 
supported by the fact that no similar change was observed for the digit 
signals, which have no equivalents. 

The failure of the present study to find any advantage of the isomor- 
phic equivalents with respect to learning time may conceivably be re- 
lated to the inadequacy of the name-words chosen. Whether better 
equivalents would bring better results is questionable. The present 
writers believe that no great advantage is likely to accrue from a more 
careful choice of words. The speech habits of years’ standing—the 
vocal differentiations already established—in the average student of 
code are bound to intrude in a fashion that will often lead to erroneous 
response. 


Received June 6, 1945. 
Validity of the Hunt-Minnesota Test for Organic
Brain Damage 


Rachel F. Malamud 
Psychological Laboratories, Norwich State Hospital 


Clinical psychologists are frequently called upon to aid the psychi- 
atrist with the practical, clinical problem of determining the presence or 
absence of organic brain damage in an individual patient. A number of 
psychologists and psychiatrists have devised tests bearing on this prob- 
lem, but none have been entirely satisfactory. One of the more recent 
attempts is the Hunt-Minnesota Test for Organic Brain Damage (1). 
To what extent this test can fulfill its purpose is the subject of this paper. 

The Hunt-Minnesota Test consists of three major divisions: the 
vocabulary test which is relatively insensitive to brain damage, a group 
of tests sensitive to deterioration, and a group of interpolated tests. The 
subject’s Stanford-Binet vocabulary score, in relation to his age, deter- 
mines the score level at which he is expected to perform the more sensitive 
tests. The deterioration tests, consisting of pairs of words and of de-
signs which the subject is required to associate and later recall or recog-
nize, determine the level at which he is actually functioning. The
amount of discrepancy between the subject’s expected score and the score 
he actually makes on the word and design associations is the basis for the 
diagnosis of brain damage. The discrepancies are indicated by T scores; 
those T scores which fall higher than a certain critical point are con- 
sidered to be indicative of organic damage. In an effort, which Hunt (4) 
describes as “partially successful,” to make sure that these high T scores 
are produced by brain damage alone and not by factors of inattention or 
poor motivation, he includes in the battery nine short interpolated tests 
of attention and cooperation. The patient who passes all, or nearly all, 
of these tests is considered probably capable of being examined validly 
by the test proper. The final decision on the test’s validity for the 
individual subject, however, is left to the judgment of the examiner. 


Hunt’s Results 


Hunt found his test highly satisfactory in differentiating 33 known 
organic cases from 41 non-organic control subjects. Using a critical T 
score of 68¹ and ignoring one “doubtful” normal, he found that only
two organic cases and one non-organic case were misclassified. Even
though the total number of subjects used for standardization was small,
Hunt (2) felt able to conclude that “. . . the validity of the test battery
as a discriminating instrument is statistically established.” Subsequent
use of the test in the Neuropsychiatric Clinic of the University of Minne- 
sota Hospitals also proved encouraging. Hunt (3) states in a second 
paper, “. . . T scores above 60 probably justify suspicion of pathology. 
For practical purposes, a T score of 66 (not 68 as suggested in the manual) 
should be considered as the ‘critical score’ dividing the normal from 
abnormal performances.” 


Problem 


To the members of the Norwich State Hospital Psychological Labor- 
atories the test appeared to have outstanding advantages. It was simple 
to administer and score; it took only 15 to 30 minutes to administer; it 
seemed to be satisfactorily validated; and, most important of all, it pro- 
vided a means for obtaining quantitative evidence of organic brain 
damage. The test was, therefore, immediately put to use with members 
of the psychology department as the first subjects. The results from 
this preliminary use of the test were striking. Six of the ten members of 
our department were apparently suffering from brain damage. These 
results naturally led us to question the test’s validity. The following 
problem was formulated for systematic study: To what extent does the 
Hunt-Minnesota Test produce “false positives” by designating normals as
having organic brain damage? 


Procedure 


A total of 64 subjects, all employees of the Norwich State Hospital,
were tested with either the long or the short form of the Hunt-Minnesota 
Test. The majority took it as part of a routine battery given to new 
employees. All subjects fulfilled Hunt’s minimum requirements of 
ability to speak and read the English language, school attendance of at 
least three grades, adequate muscular coordination and sensory acuity, 
and a mental age of eight or more. Those given the long form passed 
all the interpolated tests for attention and motivation. In administering 
the tests the examiner followed very closely the administration procedure 
outlined in the test manual and at various times was carefully observed
and checked in her examination technique. A stop-watch was used for
timing although Hunt allows the use of simple counting. All test rec-
ords were carefully rescored for accuracy by another psychologist.


¹ Hunt actually seems to have ignored the two control cases and one organic case
scoring at 68. Of his organic cases, all those falling at or below 67 he considered mis-
classified, but of the control subjects all those falling at or above 69 he considered


Results and Analysis 


Thirty-five or 54.7 per cent of these 64 normal subjects were found to
have scores indicating organic brain damage. These results obviously
differ greatly from the 9.8 per cent of “organic” scores which appear
among Hunt’s normal subjects when a critical score of 66 is used. Figure
1 illustrates the marked contrast between the two distributions.² It will
be noticed that our subjects scattered over such a wide range of T scores
that even if the critical score had been 67 or 68, the percentage of “or-
ganic” scores would not have been greatly reduced. Although we could
not conclude that so large a proportion of normally functioning persons
have organic brain damage, neither did we immediately conclude that
the test was totally invalid. In the hope of accounting for the discrep-
ancy between our data and Hunt’s, we examined our data for possible
clues. Two characteristics of our data appeared worthy of considera-
tion.

[Fig. 1. Comparison of the distribution of 64 normal hospital employees with
Hunt’s 41 control subjects. (x = long form and o = short form.) T scores are
plotted in class intervals from 26-30 to 87-91. Normal scores: 45.3% of the
64 employees, 90.3% of Hunt’s controls. “Organic” scores: 54.7% of the 64
employees, 9.7% of Hunt’s controls.]

² The class intervals are the same as those used by Hunt (2). The T scores below
50 in our distribution were estimated by extrapolation.

First, we noted that 29 persons of our total group of 64 had vocabulary 
scores higher than the upper limit of 32 words for which Hunt says the 
test is maximally efficient. To determine whether it was these cases 
which produced the high percentage of “organic” scores, we analyzed
only the 35 cases originally within the maximal vocabulary range. Of 
these we found that 57.1 per cent had “organic” scores. The 29 high 
vocabulary records studied alone showed 51 per cent, or about the same 
per cent of abnormality as did all the cases combined. By arbitrarily 
reducing all vocabulary scores of 33 words or more to 32 words, 42.2 per 
cent of the total group still remained in the pathological category. Obvi- 
ously, it was not the superior vocabularies which accounted for the high 
percentage of “organic” scores. 
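The subgroup percentages above can be checked against the overall figure. The short Python sketch below assumes 20 and 15 “organic” records in the two vocabulary groups. These counts are not stated in the text; they are inferred from the reported percentages and the total of 35 “organic” records.

```python
# Subgroup sizes are from the text; the "organic" counts (20 and 15) are
# assumptions inferred from the reported percentages and the total of 35.
groups = {
    "vocabulary within range": {"n": 35, "organic": 20},
    "vocabulary above 32 words": {"n": 29, "organic": 15},
}


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


for name, g in groups.items():
    print(f"{name}: {percent(g['organic'], g['n']):.1f} per cent")

# Recombine the two subgroups to recover the overall rate.
total_n = sum(g["n"] for g in groups.values())
total_organic = sum(g["organic"] for g in groups.values())
overall = percent(total_organic, total_n)
print(f"all cases: {overall:.1f} per cent")
```

Under these assumed counts the high-vocabulary group comes to 51.7 per cent, which the text gives as 51 per cent, and the two groups together return the overall 54.7 per cent.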

Secondly, 14 of the 64 cases had been given only the short form. To 
determine whether these cases unduly affected the total percentage of 
“organic” scores, we eliminated the 14 records and derived percentages 
on the remaining 50 “long-form” subjects. Again it was found that the 
distribution of scores was not greatly changed. Sixty per cent of the 
“long-form” subjects had “organic” scores. Even by reducing all the 
superior vocabularies of the “long-form” group to 32 words, 48 per cent 
still remained in the pathological category. By analyzing the 14 cases 
given only the short form, and adding to them 23 cases of the “long-
form” group on which we had been able to obtain short form scores, we
found that 48.6 per cent had “organic” scores. Even after reducing the 
superior vocabulary scores, 40.8 per cent remained “organic.” The 
short form, then, approximates the long form in the adequacy (or in- 
adequacy) of its discrimination. 

In a private communication with the author, Hunt offered the hy- 
pothesis that some of our subjects failed the deterioration tests, not be- 
cause of inability to recognize the proper associations, but because the 
time limits imposed by the test were too short to allow the recognition to 
occur. Although we cannot check this hypothesis with our data, it is 
the present author’s impression that it is correct. If the time allowed 
for recognition had been longer, some of the subjects would probably 
have improved their scores. If such were the case, however, the test 
would need revalidation on both normal and organic subjects. 
Summary and Conclusions


1. When the Hunt-Minnesota Test for Organic Damage was applied 
to 64 presumably normal employees of the Norwich State Hospital, 55 
per cent had T scores indicating organic pathology. 

2. The discrepancy between our results and Hunt’s original valida- 
tion results could not be explained by the fact that our data included 
cases with very high vocabularies and cases given only the short form of 
the test. 

3. Since the test produces so many “false positives,” its validity for 
diagnosing organic brain damage must be seriously questioned. 


Received September 28, 1945. 
The Hunt-Minnesota Test for Organic Brain Damage in
Cases of Functional Depression * 


Paul E. Meehl and Mary Jeffery 
The University of Minnesota 


Among the several tests which have been devised for the detection of 
intellectual deterioration, one of the most efficient is the Hunt-Minnesota 
Test for Organic Brain Damage (5, 6, 7, 8). 

The author developed this instrument specifically for the diagnosis of 
organic damage. While the detection and measurement of a decrement 
in intellectual function, however caused, is an important part of the 
clinical psychologist’s work, methods must eventually be developed for 
distinguishing between two kinds of deterioration. On the one hand are 
those which are secondary to “emotional-motivational” factors (e.g. in 
schizophrenia) and, on the other, those which represent the direct effect 
of organic central nervous system pathology.¹ Since many varieties of
behavior disorders are characterized by a certain amount of psychological 
deficit, the psychologist will obviously be playing a more significant 
clinical role if he can make a definite contribution to differential diagnosis 
(e.g. as between “functional” and “organic” deficit) instead of merely 
reporting a deviation from the “normal” or “optimal” level.

Such an added report would carry no implication as to the ultimate
etiology of the disorder which he has thus labeled as showing either
“functional” or “organic” deterioration. For even if “organic”
(endocrine, metabolic, or autonomic) factors should finally be established 
as primary causes of the development of a schizophrenia, it would still be 
possible for an observed intellectual deficit to occur as a function of 
motivation, itself dependent upon the organic factors. In cases with 
“functional” deterioration, techniques to effectively motivate the patient 
may cause him to return temporarily to his “true” level, a phenomenon 
which has often been observed by clinicians. In the strictly
“organic” case, by contrast, no such motivational improvement can make up for a
deficit in function directly related to nerve cell destruction, such as occurs
in senile dementia or paresis.


* The authors are indebted to Dr. B. C. Schiele and Dr. A. B. Baker, Department
of Neuropsychiatry, for their cooperation in this study.

¹ As Hunt has pointed out, test results must be considered as only a part of the
evidence required for diagnosis, and must always be interpreted in the light of data
from all sources: “Deterioration test scores are thus not a final index . . . but rather
a diagnostic and prognostic aid. The extent to which they aid diagnosis and prognosis
depends, to a substantial degree, upon the skill and clinical acuity of the interpreting
clinician” (8).

In the construction of psychometric devices for detecting organic 
brain damage, therefore, a difficulty arises because deficit can reflect 
multiple causation. Consequently, in such psychometric devices, we 
must either employ tasks which shall yield no decrement for subjects who 
are merely suffering from test-anxiety, boredom, preoccupation with 
fantasy, or depressive retardation; or, if this is impossible, we must have 
means for identifying such special decrements. 

The approach suggested by the first alternative is very difficult be- 
cause possibly it might only be attained by eliminating some degree of 
test sensitivity to organic deficit. Such a loss of sensitivity might reduce 
the effectiveness of a test to a point where it would detect only an amount 
of intellectual loss so gross as to be detectable on other grounds. The 
ultimate aim of such tests must be to detect minimal amounts of damage 
so that the psychologist can contribute independent evidence of the 
presence of pathology in the same way as the serologist or roentgenologist 
can do in cases which are relatively asymptomatic. Psychological tests 
which merely detect intellectual loss in a person, known on other grounds 
to be brain damaged, do not contribute maximally to clinical work. 

There is reason to believe that this “maximal contribution” is possible
because complex intellectual processes are very sensitive to even slight 
cortical disturbances; and there is already sufficient evidence that the 
Hunt-Minnesota Test has achieved the increased delicacy desired. Yet 
just here is the “difficulty” referred to. For, no sooner has the delicacy 
of the test been stepped up to a point where it can pick up small losses 
such as those in an early senile change or an undiagnosed encephalitis, 
than it is also markedly affected by the motivational and emotional 
factors which are present in other types of cases. In short the increased 
delicacy is rarely specific for the kind of decrement we wish to detect. 
This would seem to be the crucial problem confronting the clinical 
psychologist in the field of mental deterioration. 

Experience with the Hunt-Minnesota Test at the University of 
Minnesota Hospitals has demonstrated its high validity as an indicator 
of organic brain damage. Some of the data have already been published 
(7), and other validation studies are in progress. Experience, however, 
in testing cases of depression aroused a doubt that the specificity of the 
device for organic damage was as great as had originally been hoped. 
This feeling was first expressed by Hunt, himself, in this journal (8) when 
he wrote: “In the development of the Hunt test, an attempt was made to 
provide a special means for identifying those pathologic scores attribut-
able to emotional-motivational disturbances so that the test would then
be a specific test for the deterioration associated with brain damage. 
This attempt has been only partially successful.” 

Hunt had attempted to identify the scores that he referred to, by 
including a set of “validity” tests (called interpolated tests in his manual)
such as digit span, attention, and saying the months forward and back- 
ward. His theory was that persons who are disturbed or uncooperative 
so much as to invalidate their test results would fail the interpolated 
tests, and thus the examiner would have an index for avoiding an inter- 
pretation of deterioration due to organic brain damage. As will be seen 
from study of the manual, the standard of scoring is extremely lenient; 
the criterion of invalidity being ‘failure’ on three or more of the nine 
interpolated tests. 

As it stands, the Hunt test showed results that were gratifying with 
the majority of cases at the University Psychopathic Unit. However, 
with some persons showing anxiety and depression, high deterioration 
scores were obtained without other evidence of organic brain damage. 
Most of these patients seemed quite capable of cooperating as judged by 
the interpolated tests, usually passing them by a wide margin. To 
corroborate this clinical impression the present study was undertaken. 

It is important that investigations of this sort should avoid uninten- 
tionally including subjects already suffering from minimal organic 
damage. The mere absence of a diagnosis of pathology cannot be taken 
as proof of normality, without a systematic check in the form of careful 
history taking and neurological examination. Even using neurology as 
the criterion, it is unfortunately true that, among the so-called “false
positives,” an unknown number of persons are actually correctly “posi-
tive.” However, all one can do is to include only cases which have been
neurologically studied and are negative, and then to make the assumption 
that only a small minority of the group (in the absence of other evidence) 
have any minimal damage over and above that due to age, for which the 
Hunt test presumably supplies an adequate correction. 

Originally it was intended to obtain retests, following recovery, upon
all cases initially tested during a state of depression. This plan was
abandoned for several reasons. First of all, research by Arkola (1)
indicated the existence of a practice effect of some magnitude. This was
apparent even after the lapse of considerably more time than would have
passed before “recovery” in our cases. Secondly, the great majority of
depressed patients were treated with electroshock therapy which, in
itself, may result in unknown amounts of minimal brain damage. The
result, then, would have been a combined effect of three variables; two
of them (recovery and practice) would tend to lower the T-score by an
indeterminate amount, and the other (shock therapy) would tend possibly
to raise it. Accordingly, this scheme of investigation had to be aban-
doned.

In choice of subjects, several restrictions were necessary. The age 
limits of 20 to 55 years, for which Hunt claims maximum effectiveness 
for his test, were imposed. It was required that there be no hint of 
organic findings or history of shock therapy of any kind during previous 
episodes. Over a period of eleven months, despite a large number of 
patients “considered” as subjects, only seventeen subjects fulfilled our 
requirements. Of these, two were later eliminated because such sug- 
gestive signs as slight retinal arteriosclerosis or markedly elevated blood 
pressure appeared during subsequent neurological study. That the final 
group of fifteen cases of clearly “functional” depression is small, reflects
the extreme care with which cases were selected. The findings, however, 
are so clear-cut and the Hunt test is being used so extensively that the 
writers feel further delay in reporting results is ill-advised. Dr. Hunt 
concurs in this opinion. 


The Group 


The group studied consists of all in-patient cases with prominent 
symptoms of depression admitted to the Psychopathic Unit of the Uni- 
versity Hospitals from October 1944 through November 1945. Of those 
who met the required conditions, there were 13 females and 2 males, all 
between the ages of 34 and 55. The median age was 50 years, with a 
mean age of 48.7 and S.D. of 6.4 years. Education varied from 7th
grade through two years of college, the mean education attained being
10th grade with an S.D. of 2.4 years. Vocabulary level, on the Stanford
Binet list, varied from 15 words (M.A. about 13 years) to 29 words (not
quite Superior Adult III), with a mean of 22.7 words (Superior Adult I).
The t of this mean, from a hypothetical supply mean of 20 words, is 2.086
which lies between the 5% and 10% levels of probability. 

All of the patients tested had previously received thorough physical 
and neurological examinations as well as routine laboratory studies. In 
each case, these were all negative, and no case had a history of head 
trauma, addiction, or encephalitis. 

One patient had a blood pressure of 170/100 but was included because 
her chart gave three much lower readings for examinations of about 18 
months previous. She showed no evidence of cerebral arteriosclerotic 
changes, neurologically or ophthalmoscopically.

Nine cases were entirely unsedated when tested, and the remaining 
six were under sedation with either phenobarbital (1½ grains), sodium
amytal (3 grains), nembutal (1½ grains), or seconal (3 grains). An
unpublished study by Arkola (1) has shown that this amount of sedation
with the barbiturates does not produce measurable effects upon Hunt 
scores, even when administered by injection, and tested at the peak of 
the sedative effect. Most of the present cases were tested several hours 
after the oral administration of the sedatives. Furthermore, the mean 
T-score of the six sedated patients is 66.8 whereas that of the nine un- 
sedated ones is 72.4 (medians 67.5 and 75 respectively). Consequently, 
it seems safe to assume that these slight degrees of sedation cannot by 
any means account for the elevations to be reported below. 
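As a consistency check, the two subgroup means quoted above should recombine to the mean of 70.2 reported for all 15 cases. A minimal Python sketch of that weighted average, using only figures given in the text:

```python
# Group sizes and mean T-scores as reported in the text.
sedated_n, sedated_mean = 6, 66.8
unsedated_n, unsedated_mean = 9, 72.4

# Weighted mean of the two subgroup means.
combined_mean = (sedated_n * sedated_mean + unsedated_n * unsedated_mean) / (
    sedated_n + unsedated_n
)
print(round(combined_mean, 1))
```

The sedated group has the lower mean, so sedation cannot be what raised the scores of the group as a whole.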

The staff diagnoses of the fifteen cases were as follows: Involutional 
melancholia, 5; psychoneurosis, reactive depressive, 4; manic depressive 
psychosis, depressed, 2; and one each of involutional psychosis, depressed 
and paranoid; manic depressive psychosis, mixed (agitated); psychoneu-
rosis, mixed (reactive depressive and psychasthenia); and psychoneuro-
sis, anxiety state.

The mean Multiphasic Personality Inventory profile for these 15 cases 
was as follows: ? 50.8, L 56.5, F 58.8, Hs 66.6, D 86.3, Hy 72.8, Pd 69.7, 
Pa 72.1, Pt 70.7, Sc 67.4, Ma 52.6, Mf 55.7. In 10 of the cases the 
depression score (D) was the peak of the profile, and in 11 cases it was 
above 70. Among the four cases in which D was less than 70, two 
showed T-scores of 63 on the “lie” scale (L). However, one of these 
cases was not tested with the Multiphasic until some 55 days after ad- 
ministration of the Hunt, at a time when her psychiatric condition had 
improved considerably. The median time, elapsing between the ad- 
ministrations of the Hunt and of the Multiphasic was three days, although 
in two cases an interval of over eight days had elapsed between the 
administration of the two tests. 

It should be pointed out that although all of these patients were 
depressed in varying amounts, many of them were at some stage of 
improvement when tested. No patient was tested whose momentary 
psychiatric condition was such as to preclude his at least claiming ability 
to cooperate, and apparently doing so. This will be more evident when 
we later consider the results obtained on the nine interpolated tests. 

One case, called “anxiety state” and lacking the word “depression”
in her diagnosis, was included because depression, crying, weakness, and
insomnia were prominent in her complaints, and because her most
marked elevation on the MMPI was on the Depression scale (T-score
= 98).

The testing procedure was that described by Hunt in his manual; 
however, the special urging and explanation required to secure adequate 
cooperation was possibly more than would be employed routinely. But 
no actual “coaching” or allowance of leeway in time limits occurred. As
was suggested by the author, the “long form” of the Hunt test was ad-
ministered. A brief, semi-standardized interview was used following the 
Hunt test in an attempt to form some impression of the more qualitative 
aspects of the patient’s response to the test situation. The implications 
of these responses will be discussed below. 

The testing was done more or less alternately by the authors, but, 
due to special circumstances, nine cases were tested by one author and 
six by the other. Since the mean T-scores of these two sets of cases do
not differ significantly (P > .20), all of the data have been combined for
interpretation. 


Results 


The long-form T-scores of these 15 functionally depressed patients 
were as follows, in order of magnitude: 88, 87, 87, 87, 83, 75, 74, 73, 69, 
68, 65, 62, 55, 44, and 36. The mean of these scores is 70.2 and the 
median, 73. The sample SD is 15.41 and the best estimate of the supply 
variability is 15.95 T-score units. Even with a sample this small, it is 
quite evident that the central tendency of T-scores for depressed patients 
is considerably above that of the supply mean (of 50) used in interpreta- 
tion of scores. 
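The descriptive figures above can be reproduced from the listed scores. The following Python sketch takes the “sample SD” to be the standard deviation with divisor n and the “best estimate of the supply variability” to be the one with divisor n - 1; that reading is an interpretation, though it does return the reported values.

```python
import statistics

# The 15 long-form T-scores as listed in the text.
t_scores = [88, 87, 87, 87, 83, 75, 74, 73, 69, 68, 65, 62, 55, 44, 36]

mean = statistics.fmean(t_scores)
median = statistics.median(t_scores)
sample_sd = statistics.pstdev(t_scores)  # divisor n
supply_sd = statistics.stdev(t_scores)   # divisor n - 1

print(round(mean, 1), median, round(sample_sd, 2), round(supply_sd, 2))
# prints: 70.2 73 15.41 15.95
```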

Testing the hypothesis that such a sample could have arisen from a 
population with parameter mean of 50, the Student t is 4.906 which, with
14 d.f., is highly significant (P < .0002). We may conclude with con-
fidence, therefore, that the scores of depressed persons cannot be evaluated
on the basis of a non-brain-damaged supply mean of 50 T-score. 
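The t value quoted above follows from the usual one-sample formula, the difference between the sample mean and the norm mean divided by the standard error of the mean. A minimal Python sketch, with variable names chosen here for illustration:

```python
import math
import statistics

# The 15 long-form T-scores as listed in the text.
t_scores = [88, 87, 87, 87, 83, 75, 74, 73, 69, 68, 65, 62, 55, 44, 36]
norm_mean = 50  # supply (norm) mean of the T-score scale

n = len(t_scores)
mean = statistics.fmean(t_scores)
sd = statistics.stdev(t_scores)        # divisor n - 1
standard_error = sd / math.sqrt(n)

t_stat = (mean - norm_mean) / standard_error
df = n - 1
print(round(t_stat, 3), df)
# prints: 4.906 14
```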

The obtained estimate of the SD is 15.954, about half again as large 
as the norm sigma of 10 points. Making use of the fact that the ratio 
of a sample variance to the supply variance is distributed as χ²/n, we
find a χ² of 35.604 which, with 14 d.f., is again highly significant (P <
.008). It is clear, then, that neither the mean nor the variability of the
depressed population can be assumed to be the same as those of the norms. 
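The chi-square figure can be recovered as the sum of squared deviations about the sample mean divided by the norm variance (sigma of 10, hence variance of 100). A short Python sketch of that arithmetic:

```python
import statistics

# The 15 long-form T-scores as listed in the text.
t_scores = [88, 87, 87, 87, 83, 75, 74, 73, 69, 68, 65, 62, 55, 44, 36]
norm_sd = 10  # sigma of the T-score norms

n = len(t_scores)
mean = statistics.fmean(t_scores)
sum_sq = sum((x - mean) ** 2 for x in t_scores)

# Ratio of the sum of squares to the hypothesized variance.
chi_square = sum_sq / norm_sd ** 2
df = n - 1

# How much larger the estimated SD is than the norm sigma.
sd_ratio = statistics.stdev(t_scores) / norm_sd
print(round(chi_square, 3), df, round(sd_ratio, 2))
# prints: 35.604 14 1.59
```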

The confidence belt for the mean (using t) extends down to a T-score
of 61.37, using the 5% level of confidence. On the basis of our obtained
sample, we may therefore say that the “true” mean of depressed patients
is almost certainly not less than about 61, i.e., a full standard deviation
above the mean of the general population norms. A similar application
of the χ² distribution indicates that, at the 5% level of confidence, the
“true” SD cannot reasonably be assumed to be less than 12.26 T-score 
points. 
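Both lower limits can be reproduced with standard table values for 14 degrees of freedom. The sketch below is in Python; the two critical values are assumptions taken from ordinary t and chi-square tables, not from the text, and were chosen because they return the reported limits.

```python
import math
import statistics

# The 15 long-form T-scores as listed in the text.
t_scores = [88, 87, 87, 87, 83, 75, 74, 73, 69, 68, 65, 62, 55, 44, 36]

# Table values for 14 d.f. (assumptions, not stated in the text).
T_CRIT = 2.145      # |t| exceeded with probability .05
CHI2_CRIT = 23.685  # chi-square exceeded with probability .05

n = len(t_scores)
mean = statistics.fmean(t_scores)
sd = statistics.stdev(t_scores)  # divisor n - 1
sum_sq = sum((x - mean) ** 2 for x in t_scores)

# Lower limit of the confidence belt for the mean.
lower_mean = mean - T_CRIT * sd / math.sqrt(n)
# Smallest supply SD not rejected by the chi-square test.
lower_sd = math.sqrt(sum_sq / CHI2_CRIT)

print(round(lower_mean, 2), round(lower_sd, 2))
# prints: 61.37 12.26
```

Note that, on these table values, the limit for the mean corresponds to a two-tailed 5% point of t, while the limit for the SD corresponds to a one-tailed 5% point of chi-square.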

With only 15 cases it was neither practicable nor legitimate to make a
normal curve fit and test for normality. However, the w test of Geary
(10), employing the ratio of the MD to the SD, was done since it is quite
exact even for this small a sample. The MD of these cases is 12.32,
which bears a ratio of .800 to the sample SD. This is almost precisely 
the mean of the sampling distribution of w, and there is no reason for 
assuming that the distribution of scores in the supply is abnormal. 
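The ratio used here is the mean absolute deviation about the mean divided by the sample SD (divisor n). The Python sketch below recomputes it. For comparison it also prints the square root of 2/π, the value the ratio approaches in large normal samples; that figure is added here for orientation only, and the exact small-sample mean of w differs slightly from it.

```python
import math
import statistics

# The 15 long-form T-scores as listed in the text.
t_scores = [88, 87, 87, 87, 83, 75, 74, 73, 69, 68, 65, 62, 55, 44, 36]

n = len(t_scores)
mean = statistics.fmean(t_scores)

# Mean deviation (MD): average absolute deviation about the mean.
mean_deviation = sum(abs(x - mean) for x in t_scores) / n
sample_sd = statistics.pstdev(t_scores)  # divisor n

w = mean_deviation / sample_sd
normal_limit = math.sqrt(2 / math.pi)  # large-sample value under normality

print(round(mean_deviation, 2), round(w, 3), round(normal_limit, 3))
# prints: 12.32 0.8 0.798
```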

When this approximating assumption has been made, the question 
arises: How many depressed patients may be expected to show T-scores 
above the “critical line” of 70? If the sample mean is taken as the best 
estimate, it is apparent that about half of all cases may be expected to 
show such spuriously “organic” scores.

Or, more generously, the extreme (most favorable) limits of the con-
fidence belt for the mean and sigma of the supply may be taken. That
is, if it is assumed that the true mean is as low as 61.368, as indicated
above (a very improbable sampling error), and that the true standard
deviation is as small as 12.260, the critical score of 70 is about .704
standard deviations above the mean in such a population distribution.
On the assumption of normality, this implies that about 24%, or nearly
one in four, depressed persons can be expected to have “pathological”
T-scores.

If, as suggested by Hunt in his second article (6), the critical score of 
66 were used, the line would be set at .378 sigma above the hypothesized 
supply mean and therefore 35%, or about one in three, depressed cases 
would show a “pathological” result. Inspection of the distribution and 
the mean-median relationship would suggest that, to the extent the as- 
sumption of supply normality does not hold, it is because of negative 
skewness, possibly due to the presence of the rare depressed person whose 
emotional state leaves his motor and cognitive functions relatively intact. 
Such a skewness would of course make the proportion of spuriously
deteriorated scores even higher.
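The proportions quoted in the last three paragraphs are upper-tail areas of a normal distribution. A minimal Python sketch, assuming normality as the text does and using the means and standard deviations given above (the function name is illustrative):

```python
from statistics import NormalDist


def share_above(critical, mean, sd):
    """Proportion of a normal population scoring above `critical`."""
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(critical)


# Sample estimates and the most favorable confidence limits from the text.
SAMPLE_MEAN, SAMPLE_SD = 70.2, 15.95
LOW_MEAN, LOW_SD = 61.368, 12.260

half = share_above(70, SAMPLE_MEAN, SAMPLE_SD)  # sample estimates, line at 70
at_70 = share_above(70, LOW_MEAN, LOW_SD)       # favorable limits, line at 70
at_66 = share_above(66, LOW_MEAN, LOW_SD)       # favorable limits, line at 66

print(f"{half:.3f} {at_70:.3f} {at_66:.3f}")
```

The three values come to roughly one half, 24% and 35%, matching the “about half,” “one in four” and “one in three” of the text.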

In summary of these analyses, it is clear that the present sample 
makes it practically certain that the elevations of T-score in depressed 
persons cannot be evaluated in terms of the published norms if the 
desired interpretation, that of deterioration due to organic brain damage, 
is to be made. At the very best, we see that about one in four function- 
ally depressed patients will show scores above the critical line of 70, or 
about one in three using the score Hunt advises. A much more plausible 
estimate in terms of the sample statistics is, of course, that about half of 
the patients will show such elevated scores. 

How well do the interpolated tests function in their purpose of detect- 
ing such spuriously “pathological” cases? Of the entire group of 15 
depressed cases, only one case failed as many as three interpolated tests, 
Hunt’s criterion that the test is invalid. Indeed, only four of the present 
group failed any of the nine interpolated tests; and inspection of the 
protocols shows, additionally, that the great majority of the cases were
even far removed from the “danger line” on any interpolated test. For
those four cases who failed one or more interpolated tests, the T-scores 
were 87, 87, 83 and 73. The one patient whose test would have been 
identified as invalid on the basis of interpolated test scores (with a failure 
of six out of nine) had a T-score of 73. 

Arbitrarily, rough weights were assigned to the scores on each inter- 
polated test, and the weights for all nine interpolated tests were summed 
for each patient. There was no significant relation in our sample between 
this quantity and the size of T-score (r = .15, P > .50). 
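The significance statement for this correlation can be checked by converting r to a t statistic with n - 2 degrees of freedom. In the Python sketch below, the comparison value is an assumption taken from a standard t table, not from the text.

```python
import math

r = 0.15  # correlation reported in the text
n = 15    # number of patients
df = n - 2

# t statistic for testing a zero population correlation.
t_stat = r * math.sqrt(df) / math.sqrt(1.0 - r ** 2)

# Table value (an assumption, not from the text): |t| exceeded with
# probability .50 for 13 d.f.
T_AT_P50 = 0.694

print(round(t_stat, 3), t_stat < T_AT_P50)
# prints: 0.547 True
```

Since the computed t falls below the table value, P exceeds .50, in agreement with the text.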

Whether or not the scoring on the interpolated tests could be made 
more rigorous as a method of solving the present problem cannot be 
determined from our data. But the good performance of most of the
cases, and the lack of correlation with T-score, suggest that such a
stratagem might not work. In order for the majority of functionally
depressed patients to fail them, the scoring of the interpolated tests
would have to be so rigorous that they probably would begin identifying
cases of actual deterioration as “invalidly tested.” This seems very
likely since these tests have already been used with some success as 
indicators of deterioration by Babcock (2) and others. However, such a 
possibility would need to be explored further. 

It should be noted that the examiners, on the basis of their previous 
clinical experience with the Hunt test as used with depressed cases, were 
able to supplement the interpolated tests in assessing the validity of each 
test. The test, then, did not “miss” diagnostically as often as the 
statistics would indicate. 

Any reasonably competent clinician would, of course, use his judg- 
ment in cases where the psychiatric condition of the patient made in- 
validity a serious possibility. The examiners, however, would not have 
been able to distinguish the spuriously high scores adequately here, even 
though probably influenced by test performance. Before actually 
scoring the test, each examiner made a rating as to the apparent validity 
of the testing, trying to exclude estimates of the quantitative results; and 
to judge, both in terms of the performance as it appeared qualitatively, 
and in terms of results from the short, post-testing interview. These 
ratings fell into three categories, namely: probably valid (6 cases), 
doubtful (4 cases), and probably invalid (5 cases). Dividing the 15 
T-scores into three categories from high to low in the same proportions, 
a chi-square test on the resulting nine-fold table was not significant 
(χ² = 5.379, 4 d.f., P > .20).

The results of a short semi-standardized interview, following the ad- 
ministration of the Hunt test, might be discussed briefly. Answers to 
the question: “How did you like taking this test?” were rated jointly by
the authors, independently of knowledge of test-scores; three categories 
(favorable, neutral, unfavorable to the test) were used. A chi-square 
between these ratings and the size of the T-score (9-fold table) was 16.35, 
which with 4 d.f. is significant at the 1% level. The contingency
coefficient based upon this χ² is .724, indicating some relationship between
how badly the patient performed and his own emotional reaction of 
disfavor toward the test situation. 
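
As a rough check on these figures, the sketch below computes the upper-tail probability of chi-square with 4 d.f. (using the closed form available for even degrees of freedom) and Pearson's contingency coefficient. It assumes all 15 patients enter the table; the function names are illustrative and not taken from the paper.

```python
import math


def chi2_tail_4df(x):
    """Upper-tail probability of chi-square with 4 d.f.: exp(-x/2) * (1 + x/2)."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


def contingency_coefficient(chi2, n):
    """Pearson's contingency coefficient C = sqrt(chi2 / (chi2 + n))."""
    return math.sqrt(chi2 / (chi2 + n))


CHI2 = 16.35   # chi-square reported for the 9-fold table
N_CASES = 15   # assumed: every patient is counted in the table

p_value = chi2_tail_4df(CHI2)
c = contingency_coefficient(CHI2, N_CASES)
c_max = math.sqrt(2.0 / 3.0)  # largest C attainable in a 3 x 3 table

print(f"P = {p_value:.4f}, C = {c:.3f}, maximum C for 3 x 3 = {c_max:.3f}")
```

Under these assumptions P falls well below .01, and C comes out at about .72, close to the reported .724; the small difference may reflect rounding. Since C cannot exceed about .82 in a 3 x 3 table, the observed value is a fairly large fraction of its possible range. Applied to the earlier chi-square of 5.379, the same tail function gives a P of about .25.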

An arbitrary weighting of a check list for emotional responses (crying, 
trembling, etc.) shown by the patient together with subjective judgments 
by the examiner as to the patient’s degree of retardation, motivation, 
etc., correlated .45 with the T-score which the patient obtained on the 
Hunt test. That is, high scores were associated with a higher degree of 
emotional disturbance. With only 15 cases, this correlation lies between 
the 5% and 10% levels using Fisher’s t.
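
The significance statement can be reproduced with a minimal sketch of the usual t test for a correlation coefficient. It assumes a two-tailed test with n - 2 = 13 d.f.; the critical values are the standard table entries for 13 d.f., and the function name is illustrative.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 d.f."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Two-tailed critical values of t for 13 d.f., from standard tables.
T_CRIT_10_PERCENT = 1.771
T_CRIT_5_PERCENT = 2.160

t = t_from_r(0.45, 15)
print(f"t = {t:.2f} with 13 d.f.")
print("Between the 10% and 5% levels:", T_CRIT_10_PERCENT < t < T_CRIT_5_PERCENT)
```

With r = .45 and 15 cases, t is about 1.82, which lies between the two critical values, in agreement with the statement above.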

Between magnitude of T-score and score on the Minnesota Multiphasic
Depression scale, there was an insignificant association (r = -.13,
P > .60).

The four highest T-scores are those of psychotics, but so are the two 
lowest. On the whole, the diagnoses seem to be scattered randomly 
among the test scores. The mean score for the nine cases of psychosis 
was 72.8, and that for the six psychoneurotic cases was 66.3, a difference 
which is quite insignificant statistically (P > .40). Breaking the set of 
T-scores into “High” and “Low” and then obtaining a chi-square on the 
resulting fourfold table, again shows an insignificant association between 
severity of T-score on the Hunt test and diagnosis (χ² = .028, 1 d.f.,
P > .80). It should be recalled, however, that the numbers here become 
so very small that quite possibly the study of larger groups, of psychotics 
compared with neurotics, would yield a difference. 
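
As a simple consistency check, the two subgroup means can be pooled by weighting each by its number of cases. The sketch below uses only the figures given in the text; the helper name is illustrative.

```python
def pooled_mean(groups):
    """Weighted mean of several groups, given (number of cases, mean) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n


# (number of cases, mean T-score) as reported: psychotic, psychoneurotic.
groups = [(9, 72.8), (6, 66.3)]

overall = pooled_mean(groups)
print(f"Pooled mean T-score: {overall:.1f}")
```

The pooled value agrees with the mean of 70.2 reported for the entire group in the Summary.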

From these various findings, we may tentatively, with suitable caution 
because of the small sample, conclude that examiner judgments of validity,
amount of upset shown by the patient, diagnosis of psychosis or neurosis, 
or a measure of depression such as that of the Multiphasic Depression 
scale, would not enable one to separate valid from invalid testings. It 
would seem, then, that the best approach is to either avoid giving the test 
to depressed patients at all, or look upon its results in such cases as 
indicators of loss in intellectual efficiency without implication of under- 
lying organic pathology. 

With a sample this small, such correlations mean little, but it may be 
worth while to indicate such trends as the relative absence of relation 
between the T-score and certain other variables. Since the T-score is 
based upon a deviation from the multiple regression plane (learning score
regressed upon age and vocabulary), one would not expect any relation
to exist here. Correlation of T-score with age is insignificant (r = -.22,
P > .40) as is that with vocabulary (r = -.14, P > .50). The correlation
of T-score with maximum grade reached in school is also insignificant
(r = -.30, P > .20).


Qualitative Observations 


When questioned, the majority of the patients stated that they felt 
they could have performed better had they been tested before they became 
ill. And, it was observed that a number of them showed overt signs of 
upset such as crying, tremor, and peculiarities of voice and speaking 
rate. A few expressed a lack of interest in the proceedings, as would be 
expected in depressed persons. However, all were sufficiently coopera- 
tive to be willing apparently to attend to the test material; only two were 
inclined to admit that they were not really trying very hard. 

The explanations patients gave of poor performance varied—that 
their thoughts tended to be on other things, that they felt too sad to care 
about the test, and in some cases that they were really trying to make a 
good showing but simply could not remember adequately. It was not 
possible, from either the quantitative or the qualitative data at hand, to 
form any clear hypotheses as to the manner in which depression inter- 
feres with the intellectual output. 

However, it is likely that the simple fact of retardation could lead to a 
considerable elevation of T-scores, considering the rather split-second
timing which the Hunt test employs. Preoccupation with “other things” 
is, of course, a possibility; but few would admit to this and, indeed, the 
examiners’ impression is that this was not a very real factor, considering 
the more-than-adequate performance of the great majority of the cases 
when taking the interpolated tests. 

Considering the foregoing, the writers are convinced that, on the basis 
of the subject’s behavior in the testing situation, the examiner cannot 
adequately judge whether psychiatric upset is seriously impairing 
validity. 


Summary 


The Hunt-Minnesota Test for Organic Brain Damage was admin- 
istered to a group of 15 persons with functional depressions, of whom 
nine were psychotic and six neurotic. All of these cases were between 
the ages of 34 and 55 years, and were neurologically and serologically 
negative for organic brain damage. None of them had a history of 
alcohol or drug addiction, head trauma, or encephalitis. All were
cooperative to the extent of being willing to take the test and to apparently
pay attention to the stimulus materials. Only one of the 15 was
disturbed so greatly as to fail as many as three of the interpolated tests, and
11 subjects did not fail any of them. The findings were: 


1. The mean T-score of the entire group was 70.2, with a SD of 15.41 
points. Both the mean and the standard deviation differ significantly 
from a hypothetical supply with a mean of 50 and a SD of 10. 

2. By the setting up of confidence belts for the estimation of popula- 
tion mean and variance, it is shown that at the very least, one can expect 
about one in four functionally depressed patients to have “pathological”
scores (T > 70); or, setting the critical score at 66, about one in three 
patients. 

3. The best estimate is that about half of functionally depressed 
patients may be expected to show scores over 70 on the Hunt test. 

4. It is not possible, from the external manifestations of the patient’s 
emotional disturbance, for the examiner to separate “valid” from
“invalid” testings.

5. It is concluded that the Hunt-Minnesota Test for Organic Brain 
Damage, as it now stands, is not entirely specific for organic brain damage. 
Significant scores on this test obtained upon cases with depression as an 
important component of their illness cannot be interpreted except as a 
decrement in intellectual function of undetermined etiology. 


It would be a mistake to extend this interpretation to the test scores 
of all patients with functional disorders, however, for over half of Hunt’s 
original standardization group was composed of such cases. The mere 
presence of psychiatric involvement, as in a severe psychoneurosis, is by 
no means sufficient to invalidate the results, as will be shown by data 
soon to be published. However, examiners should interpret with caution 
a significant Hunt score which is obtained on a patient depressed to a 
considerable degree. 


Received February 9, 1946. 
Book Reviews


Smith, May. Handbook of industrial psychology. New York: Philo- 
sophical Library, 1944. Pp. 304. $5.00. 


The preface states that “this little book is not intended to be a de- 
tailed chronicle of psychology from the industrial standpoint, but to 
provide an introduction to the subject for those who are in some way 
responsible for others, or who have to get on with others.” 

Neither psychologist nor lay reader would feel after reading this book 
that his supervision or understanding of workers had been improved. 
American psychologists will be left wondering whether industrial psy- 
chology in England lags far behind that in America or whether the author 
has inadequately presented work accomplished in her country. Lay 
readers who note the author’s defensiveness in the preface with regard to 
general acceptance of industrial psychological research, are likely to con- 
clude that the fault lies with psychologists for having devoted the major 
part of their time to insignificant problems. 

The opening chapter gives a historical survey of work preceding 
modern industrial psychology, and similar material is found throughout 
the entire book. Much of this material will be new and interesting to 
many readers, who may wonder, however, at the fragmentary nature of 
modern work as compared with shrewd observations made several cen- 
turies ago. Considerable emphasis is placed on fatigue and environ- 
mental conditions such as light, temperature, noise, and hours of work. 
In this respect the book deviates from the current American trend away 
from sensory aspects of industrial psychology. 

References to American work are conspicuously few. In the field of 
motion and time studies the work of Ralph Barnes, the Gilbreths, and 
F. W. Taylor is mentioned. Other references identified by the author 
or recognized by the reviewer as American are limited to brief mention- 
ings of work by V. V. Anderson, J. Goldmark, Elton Mayo, Munsterberg, 
Roethlisberger and Dickson, and the National Research Council.
Numerous references are for the period during and immediately after World
War I and may, consequently, give lay readers the impression that little 
advance has been made during the past ten or twenty years. The ref- 
erences provide, however, an excellent list of the publications of the 
Industrial Research Board as well as other British studies; all of which 
are too little known in America. 

Lack of organization is obvious in the author’s theories and
classifications as well as in the plan of the book as a whole. There is no index,
which fact makes it difficult for the reader to gather together all material 
on a given subject. Many excellent observations and insights appear in 
the book, but they are scattered and given no emphasis. Interspersed 
with these are numerous platitudes. 

As a whole the book will be disappointing to American psychologists, 
and the field of industrial psychology will be disappointing to lay readers 
who judge the field by the book. 

Clifford E. Jurgensen 


Minneapolis Gas Light Company, 
Minneapolis, Minnesota 


Practical handbook for counselors. Chicago: Science Research Associates, 
1945. Pp. 160. $1.50. 


This handbook is directed primarily at counselors in secondary schools. 

Handbooks, if encouraged, could easily become substitutes for sound 
training for counselors. This would be unfortunate because they are of 
necessity superficial and skimpy. There are many examples of this in 
the present handbook. On page 54, only three paragraphs are used for 
the discussion of the Technique of the Interview. In Chapter 5 is “an
annotated list” of tests. Not only is the list very limited but also biased 
as may be seen in the contrasting annotations for the Kuder and the 
Strong, page 44. Furthermore, in regard to the Kuder the following 
erroneous statement occurs: “This inventory measures an individual’s
interests in nine occupational areas. . . .” Does it? The reviewer has
never seen any evidence to indicate that this is true. 

If a person wants a superficial survey of the field of counseling with 
no intention of practicing counseling, this booklet would be of some use; 
but the reviewer believes it is unwise to put such a device into the hands 
of a naive person who intends to do counseling. Such a person might 
conclude that he could do counseling. 

L. E. Drake 


University of Wisconsin 


Lazarsfeld, P. F., Berelson, B., & Gaudet, H. The people’s choice. How 
the voter makes up his mind in a presidential campaign. New York: 
Duell, Sloan & Pearce, 1944. Pp. 178. 


For the most part, this report analyzes the results of repeated ques- 
tioning of a panel of 600 respondents in Erie County, Ohio, in relation to 
the 1940 (Roosevelt-Willkie) presidential election. Results and inter- 
pretation are stressed in this report while the most important methodo- 
logical problems are omitted for treatment in separate reports. The 
background material includes a description of the county, a summary of
political events, and an analysis of the influences operating during the 
campaign. 

The survey itself collected data on voting intentions, expectations, 
exposure to propaganda, and the usual information on respondents’ 
characteristics—to mention only a few of the more important subjects 
covered by the interviews. With the use of a panel technique, changes in 
voting intentions were followed closely and the reasons for these changes 
were analyzed. 

Any attempt to summarize the results or explain the details, within 
limited space, would do the study an injustice. The interpretations and 
conclusions cover a broad field all the way from the contention that there 
is a bandwagon effect to the discovery that “in-laws” agree less than other
family members in presidential preference. 

The progress made by this study is striking. Its addition to our 
knowledge of the voter is enough to justify the study beyond a doubt. 
Psychologists, sociologists, political scientists, and others with an
academic interest in political behavior will find the results valuable. Even
practical politicians may find a few applications, but they will have to 
dig them out themselves: the report includes very little interpretation 
from the standpoint of practical politics. 

One reason for this productivity is the use of the panel technique but 
the reviewer thinks there are other reasons as well: (1) ingenuity in 
devising hypotheses to be tested, (2) cleverness in developing tests of the 
hypotheses, and (3) skillful use of breakdowns. 

When a study tackles a difficult practical problem, limitations are to 
be expected; and their presence is not a reflection on the quality of the 
research. They are listed in this review merely as problems. 


1. Many of the most important analyses are based upon sub-groups, 
and some of these sub-groups are quite small. 

2. It is difficult to tell to what extent the findings in a single county 
would hold up if tested by a more adequate sample. Presumably the 
trends, relationships, and processes would be fairly uniform in all geo- 
graphic areas, but even these can be affected by factors which do vary in 
different areas. 

3. People who change political preference definitely within a short 
period constitute a relatively small proportion of the total. Thus the 
group that is most productive for the study of changes is not large enough 
to stand much further subdivision for purposes of analysis. 


One omission is relatively unimportant as far as this report is con- 
cerned; but it may be significant in relation to other problems. One 
reason given for the selection of Erie County was that for forty years
preceding 1940 it had reflected national voting trends quite accurately. 
Naturally many readers will wonder whether Erie County continued to 
reflect the national trend for the election covered by this study; but the 
report is silent on this point. 

This study has laid the groundwork for similar surveys. An excellent 
job of ice-breaking has been done. The reviewer hopes that repeating 
this survey for subsequent elections will be possible. 

Alfred C. Welch 


Knox Reeves Advertising, Inc., 
Minneapolis, Minnesota 


Chamberlin, Dean, Chamberlin, Enid, Drought, Neal E. and Scott, 
William E. Did They Succeed in College? A Follow-up Study of the 
Graduates of the Thirty Schools. New York: Harper and Bros.,
1942. Pp. 291. $2.50. 


This investigation concerns the successful adjustments in college of a 
large number of graduates of the thirty high schools participating in the 
well-known eight-year study of secondary education. The high schools, 
it will be remembered, were freed, by agreement with colleges, from re- 
quiring that students enroll in the traditionally required college prepara- 
tory subjects. This experiment is, therefore, a test of whether students 
can succeed without the traditional preparatory subjects, having studied 
the then-called “progressive” subjects. The colleges involved in the
study include almost all types and the curricula chosen by the students 
covered almost the full range available to them. A “comparison” group 
of control students enrolled in the same colleges were selected and 
“matched” with respect to comparable scholastic aptitude scores, sex, 
race, age, religious affiliation, size and type of secondary school, home 
community, socio-economic status of family, extra-curricular activities 
in high school, vocational objective and other such factors. These and 
other data were collected from the colleges’ admissions forms and directly 
from the students themselves. In all, 3583 students—1826 men and 1757 
women—were studied, but only 1475 were matched with control students 
and studied intensively. 

Rather extensive analyses are presented of the adjustments of these 
1475 experimental and 1475 control students in colleges, those enrolling 
in 1936 being studied for four years. The graduates of the thirty schools 
earned grades which were slightly but consistently higher than those of 
the comparison group (p. 24—). This slight superiority was found in all 
subjects except foreign languages. In general, the experimental students 
selected the same types of college subjects for specialization as did the 
comparison group. No marked differences between the two groups were
found in the number or percentage placed on scholastic probation because 
of low grades or in the number of scholastic honors received except in the 
highest level of aptitude where ten per cent more experimental than 
control students received honors. This latter point deserves further 
emphasis because of the claim of the progressive educators that the pro- 
gram removed many of the restrictions and hindrances to learning often 
operating in the cases of very able students. Numerous additional 
analyses are presented with respect to comparisons of the two groups in 
scholastic, personal, social and emotional developments and adjustments 
in the colleges. In general, the results are rather consistently favorable 
to the experimental students. 

This reviewer will not criticize certain minor faults of this significant 
experiment. It is, however, to be regretted that the experimenters did 
not find it possible to experiment further, or did not see the possibilities 
of further experimentation, to make their findings more acceptable to 
those college educators who still doubt that high schools are capable of 
determining what are satisfactory instructional materials. A smaller 
experimental and control group could have been selected for testing with 
the comprehensive achievement examinations developed by the staff of 
the Eight-Year Study to measure the special and detailed outcomes of the 
progressive curriculum. Then still other paired groups could have been 
selected for testing with standardized achievement tests, such as those 
produced by the Cooperative Test Service and also those of the Iowa 
Every Pupil Testing Program. These two supplementary experiments 
would have further tested the relative contributions of progressive cur- 
ricula, and their opposites, in high schools. 

The experiment reported in this book is a classic one, although, for the 
most part, college faculties and directors of admissions have not rushed 
to reform their entrance requirements. Sad as it is to report the fact, 
we must state that only a few cracks have appeared in the college
admissions façade. Faculties continue to require prerequisite subject
matter for admission as a freshman and to subsequent advanced courses. 
The whole basis for prescribing such prerequisite subject matter is dis- 
credited by such experiments as this one. But little change in practice 
is observed, and the recent Harvard report on general education might 
well have been written without knowledge of, or concern for, the
experimental findings of such studies as this one. Obiter dicta, not
experimentation, continue to be the modus operandi for determination of
educational policies. We shall need to accumulate current evidence of the
college adjustments of veteran-students admitted without traditional 
prerequisites to prove once again that: (1) exposure to, or lack of
exposure to, learning opportunities and (2) legislating admissions requirements
on the basis of non-experimental evidence are not sound ways of
determining who is eligible for and destined to become a successful college
student. 


E. G. Williamson 


University of Minnesota 


Wells, F. L. and Ruesch, Jurgen. Mental Examiner’s Handbook, Revised 
Edition. New York: Psychological Corporation, 1945. Pp. vii + 
211. $4.50. 


This is a revision of the Handbook published in 1942 which has been 
found very useful in psychiatric examination. It should be stressed that 
the Handbook is intended primarily for aid in the examination of “psy- 
chiatric” patients, that is to say, patients whose behavior deviates from 
the normal much more than is found in the practice of most readers of the 
Journal of Applied Psychology. 

It is divided into a section on so called “clinical” aspects, which 
merely attempts to list and more or less objectify the usual psychiatric 
examination of patients; and a section dealing with somewhat more 
standardized “tests” of mental functioning such as vocabulary, word 
association, proverb interpretation, the Kent E-G-Y questions, and so 
forth. The present edition is an improvement over the previous one, 
especially with respect to the presence of norms. It is particularly useful 
in the training of medical students just beginning the study of neuropsy- 
chiatry, furnishing a more or less objective set of tasks by which they may 
assess the patient’s functioning without actually requiring a thorough 
study of psychometrics. In some respects the work might be criticized 
for lending an impression that certain determinations are relatively 
simple and easy, such as the listing of Murray’s “needs” in the same way 
as one lists the varieties of orientation. Some psychologists might ob- 
ject to the giving of “mental age” values on certain of the tests for 
similar reasons. On the whole, this Handbook fills an important place 
between rigorous psychometrics and all that implies, and the almost 
wholly unstandardized and subjective “mental status” examination of 
clinical psychiatry. 


Paul E. Meehl 
University of Minnesota 


Sachs, H. Freud: Master and Friend. Cambridge: Harvard Univ.
Press, 1944. Pp. 195. $2.50.


This is an enjoyable book. It is light and informative, partly bio- 
graphical, partly autobiographical. Little insights into the workings of 
the group of people surrounding Freud are liberally interspersed, and it
furnishes an excellent description of the social atmosphere in which Freud 
lived and initiated psychoanalysis. 

This book is not an important one for personnel directors, but will be 
useful reading for the majority of those psychologists who are interested 
in psychoanalysis. 


K. W. Oberlin 
Western Electric Company 





New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be 
sent to Donald G. Paterson, Editor, Department of Psychology, 
University of Minnesota, Minneapolis 14, Minnesota 


Psychoanalytic therapy, principles and application. Franz Alexander, 
Thomas Morton French, et al. New York: The Ronald Press Co., 
1946. Pp. 353. $5.00. 

Employment tests in industry and business: A selected annotated biblio- 
graphy. Hazel C. Benjamin. Princeton: Industrial Relations Sec- 
tion, Princeton University, 1945. Pp. 46. $.50. 

Music and sound systems in industry. Barbara Elna Benson. New 
York: The McGraw-Hill Book Co., Inc., 1946. Pp. 124. $1.50. 

The successful employee publication. Paul F. Biklen and Robert D. 
Breth. New York: The McGraw-Hill Book Co., Inc., 1946. Pp. 
179. $2.00. 

Student personnel work in the postwar college. Willard W. Blaesser, et al.
Washington, D. C.: American Council on Education, 1945. Pp. 95. 
Gratis. 

Manual of child psychology. Leonard Carmichael. New York: John 
Wiley & Sons, Inc., 1946. Pp. 1496. $6.00.

Our teen-age boys and girls. Lester D. Crow and Alice Crow. New 
York: The McGraw-Hill Book Co., Inc., 1946. Pp. 365. $3.00. 

Counseling methods for personnel workers. Annette Garrett. New York: 
Family Welfare Association of America, 1945. Pp. 187. $2.00. 

Cats in a puzzle box. Edwin R. Guthrie and George P. Horton. New 
York: Rinehart & Co., Inc., 1946. Pp. 67. $1.50. 

Twentieth century psychology. Philip L. Harriman, et al. New York: 
The Philosophical Library, Inc., 1946. Pp. 710. $6.00. 

Through a dean’s open door. Herbert E. Hawkes and Anna L. Rose 
Hawkes. New York: The McGraw-Hill Book Co., Inc., 1945. Pp. 
242. $2.50. 

Human welfare and industrial efficiency. L. S. Hearnshaw and R.
Winterbourn. Wellington, New Zealand: A. H. and A. W. Reed, 
1945. Pp. 169. 7s. 6d. 

Adolescence and youth: the process of maturing. Paul H. Landis. New 
York: The McGraw-Hill Book Co., Inc., 1945. Pp. 483. $3.75. 

Stone walls and men. Robert M. Lindner. New York: The Odyssey
Press, Inc., 1946. Pp. 496. $4.00.


Principles of dynamic psychiatry. Jules H. Masserman. Philadelphia 
and London: W. B. Saunders Co., 1946. Pp. 322. $4.00. 

The social problems of an industrial civilization. Elton Mayo. Boston: 
Division of Research, Harvard Business School, 1945. Pp. xvi + 
150. $2.50. 

The neuroses in war. Emanuel Miller. New York: The Macmillan 
Co., 1945. Pp. 250. $2.50. 

Industrial training and testing. Howard K. Morgan. New York: The 
McGraw-Hill Book Co., Inc., 1946. Pp. 225. $2.50. 

How to keep a sound mind. Revised edition. John J. B. Morgan. New
York: The Macmillan Co., 1946. Pp. 394. $3.00. 

Men at work. C. A. Oakley. London: University of London Press, 
1945. Pp. 301. 8s. 6d. 

Occupational information. Carroll L. Shartle. New York: Prentice- 
Hall, Inc., 1946. Pp. 339. $3.50. 

Propaganda, communication, and public opinion. A comprehensive refer- 
ence guide. Bruce Lannes Smith, Harold D. Lasswell, and Ralph D. 
Casey. Princeton: Princeton University Press, 1946. Pp. 435. 
$5.00. 

Psychiatry in modern warfare. E. A. Strecker and K. E. Appel. New 
York: The Macmillan Company, 1945. Pp. 88. $1.50. 

Thorndike-Century beginning dictionary. E. L. Thorndike. Chicago: 
Scott, Foresman and Co., 1945. Pp. 645. $1.60. 

Interviewing for NORC. National Opinion Research Center. Colorado: 
University of Denver, 1945. Pp. 154. $2.00. 

The Carnegie Foundation for the advancement of teaching. Fortieth 
Annual Report. New York: The Carnegie Foundation for the 
Advancement of Teaching, 1946. Pp. 130. Gratis. 